Matthew Honnibal
d5a6c63b62
Add regression test for #2482
2018-09-28 15:18:30 +02:00
Matthew Honnibal
e3e9fe18d4
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-09-28 14:27:35 +02:00
Matthew Honnibal
0323f5be0c
Fix _serialize module
2018-09-28 14:27:24 +02:00
Matthew Honnibal
05b6103a0c
Try to fix version pin for msgpack-numpy
2018-09-28 14:07:00 +02:00
Ines Montani
5d56eb70d7
Tidy up tests
2018-09-27 16:41:57 +02:00
Ines Montani
1f1bab9264
Remove unused import
2018-09-27 16:41:37 +02:00
Matthew Honnibal
6430b1fe64
Restore encoding arg on msgpack-numpy
2018-09-27 15:58:21 +02:00
Matthew Honnibal
276aa83d1a
Require older msgpack-numpy
2018-09-27 15:34:24 +02:00
Matthew Honnibal
2ac69facc6
Fix Python 2 test failure
2018-09-27 15:34:16 +02:00
Matthew Honnibal
72778375fb
Merge branch 'master' of https://github.com/explosion/spaCy
2018-09-27 13:54:49 +02:00
Matthew Honnibal
96fe314d8d
Fix bug when too many entity types. Fixes #2800
2018-09-27 13:54:34 +02:00
Suraj Rajan
bbdc6456c6
Set up dependency tree pattern matching skeleton (#2732)
2018-09-27 13:27:18 +02:00
Matthew Honnibal
8809dc4514
Remove deprecated encoding argument to msgpack
2018-09-27 12:56:23 +02:00
Matthew Honnibal
bae6b3e2b3
Merge branch 'master' of https://github.com/explosion/spaCy
2018-09-27 12:50:31 +02:00
Ines Montani
71cdbeada7
Revert "Also include lowercase norm exceptions"
This reverts commit 70f4e8adf3.
2018-09-27 12:29:25 +02:00
Charles-Axel Dein
014dd47c70
Add jupyter=True to displacy.render in documentation (#2806)
2018-09-27 12:28:04 +02:00
Przemysław Hojnacki
966b583d5e
agreement of contributor, may I introduce a tiny pl languge contribution (#2799)
* Contributors agreement
* Contributors agreement
* Contributors agreement
2018-09-27 12:25:22 +02:00
Charles-Axel Dein
94ad3c55f1
Add charlax's contributor agreement (#2805)
2018-09-27 12:24:42 +02:00
darindf
8227566805
Fix error (#2802)
* Fix error
ValueError: cannot resize an array that references or is referenced
by another array in this way. Use the resize function
* added spaCy Contributor Agreement
2018-09-26 21:31:03 +02:00
Ines Montani
5e0dfb34fa
Merge branch 'master' of https://github.com/explosion/spaCy
2018-09-26 11:13:58 +02:00
Ines Montani
70f4e8adf3
Also include lowercase norm exceptions
2018-09-25 12:22:02 +02:00
Keshan
9a016d17c2
Adding basic support for Sinhala language. (#2788)
* adding Sinhala language package, stop words, examples and lex_attrs.
* Adding contributor agreement
* Updating contributor agreement
2018-09-25 12:18:25 +02:00
Pranshu Jethmalani
9fd27d777e
Fix typo (#2795) [ci skip]
Fixed typo on line 6 "regcognizer --> recognizer"
2018-09-25 12:12:40 +02:00
Matthew Honnibal
b42c123e5d
Fix regression introduced by 1759abf1e
2018-09-25 11:08:58 +02:00
Matthew Honnibal
500898907b
Fix regression in parser.begin_training()
2018-09-25 11:08:31 +02:00
Ines Montani
3c4e3ade30
Fix typo (closes #2784)
2018-09-21 10:45:11 +02:00
mauryaland
68b3c544d5
Adding French hyphenated first name (#2786)
2018-09-21 10:38:13 +02:00
Matthew Honnibal
1759abf1e5
Fix bug in sentence starts for non-projective parses
The set_children_from_heads function assumed parse trees were
projective. However, non-projective parses may be passed in during
deserialization, or after deprojectivising. This caused incorrect
sentence boundaries to be set for non-projective parses. Close #2772.
2018-09-19 14:50:06 +02:00
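For readers unfamiliar with the distinction in the commit message above, the following is a minimal sketch of a projectivity check. It is not spaCy's code and does not reproduce set_children_from_heads; the function name is_projective and the convention that heads[i] holds the index of token i's head (with a root pointing to itself) are assumptions made for illustration. A tree is projective when, for every head-dependent arc, each token lying between the two is a descendant of the head.

```python
def is_projective(heads):
    """Return True if the dependency tree described by `heads` is projective.

    Assumed convention (illustrative only): heads[i] is the index of the
    head of token i, and a root token points to itself.
    """

    def is_descendant(token, ancestor):
        # Walk up the head chain from `token`; the walk is bounded by the
        # number of tokens so malformed input with a cycle cannot loop forever.
        for _ in range(len(heads)):
            if token == ancestor:
                return True
            if heads[token] == token:  # reached a root without meeting ancestor
                return False
            token = heads[token]
        return False

    for dep, head in enumerate(heads):
        lo, hi = sorted((dep, head))
        # Every token strictly between head and dependent must sit under head.
        for between in range(lo + 1, hi):
            if not is_descendant(between, head):
                return False
    return True


# "I saw her": both arguments attach to the verb, no arcs cross.
print(is_projective([1, 1, 1]))

# "A hearing is scheduled on the issue today": "on" attaches back to
# "hearing" across the root "is", which makes the tree non-projective.
print(is_projective([1, 2, 2, 2, 1, 6, 4, 3]))
```

Code that assumes projectivity can treat each subtree as one contiguous span of tokens; the second example shows a tree where that assumption does not hold.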
Matthew Honnibal
48fd36bf05
Fix test for issue 27772
2018-09-19 14:47:27 +02:00
Matthew Honnibal
6cd920e088
Add xfail test for deprojectivization SBD bug
2018-09-19 14:00:31 +02:00
John Stewart
2d15859d2a
Fixed spaCy+Keras example (#2763)
* bug fixes in keras example
* created contributor agreement
2018-09-15 13:06:39 +02:00
Matthew Honnibal
99a6011580
Avoid adding empty layer in model, to keep models backwards compatible
2018-09-14 22:51:58 +02:00
Matthew Honnibal
c046392317
Trigger on_data hooks in parser model
2018-09-14 20:51:21 +02:00
Matthew Honnibal
5afd98dff5
Add a stepping function, for changing batch sizes or learning rates
2018-09-14 18:37:16 +02:00
Matthew Honnibal
27c00f4f22
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-09-14 12:30:57 +02:00
Andrew Ongko
81564cc4e8
Update Indonesian model (#2752)
* adding e-KTP in tokenizer exceptions list
* add exception token
* removing lines with containing space as it won't matter since we use .split() method in the end, added new tokens in exception
* add tokenizer exceptions list
* combining base_norms with norm_exceptions
* adding norm_exception
* fix double key in lemmatizer
* remove unused import on punctuation.py
* reformat stop_words to reduce number of lines, improve readibility
* updating tokenizer exception
* implement is_currency for lang/id
* adding orth_first_upper in tokenizer_exceptions
* update the norm_exception list
* remove bunch of abbreviations
* adding contributors file
2018-09-14 12:30:32 +02:00
Filipe Caixeta
fe515085f3
Add words to portuguese language _num_words (#2759)
* Add words to portuguese language _num_words
* Add words to portuguese language _num_words
2018-09-14 12:30:16 +02:00
Matthew Honnibal
f32b52e611
Fix bug that caused deprojectivisation to run multiple times
2018-09-14 12:12:54 +02:00
Matthew Honnibal
8f2a6367e9
Fix usage of PyTorch BiLSTM in ud_train
2018-09-13 22:54:59 +00:00
Matthew Honnibal
afeddfff26
Fix PyTorch BiLSTM
2018-09-13 22:54:34 +00:00
Matthew Honnibal
a26fe8e7bb
Small hack in Language.update to make torch work
2018-09-13 22:51:52 +00:00
Matthew Honnibal
445b81ce3f
Support bilstm_depth argument in ud-train
2018-09-13 19:30:22 +02:00
Matthew Honnibal
b43643a953
Support bilstm_depth option in parser
2018-09-13 19:29:49 +02:00
Matthew Honnibal
45032fe9e1
Support option of BiLSTM in Tok2Vec (requires pytorch)
2018-09-13 19:28:35 +02:00
Matthew Honnibal
3eb9f3e2b8
Fix defaults for ud-train
2018-09-13 18:05:48 +02:00
Matthew Honnibal
59cf533879
Improve ud-train script. Make config optional
2018-09-13 14:24:08 +02:00
Matthew Honnibal
3e3a309764
Fix tagger
2018-09-13 14:14:38 +02:00
Matthew Honnibal
da7650e84b
Fix maximum doc length in ud_train script
2018-09-13 14:10:25 +02:00
Matthew Honnibal
a95eea4c06
Fix multi-task objective for parser
2018-09-13 14:08:55 +02:00
Matthew Honnibal
21321cd6cf
Add tok2vec property to parser model
2018-09-13 14:08:43 +02:00