Matthew Honnibal
1759abf1e5
Fix bug in sentence starts for non-projective parses
The set_children_from_heads function assumed parse trees were
projective. However, non-projective parses may be passed in during
deserialization, or after deprojectivising. This caused incorrect
sentence boundaries to be set for non-projective parses. Close #2772.
2018-09-19 14:50:06 +02:00
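For readers unfamiliar with the term, here is a minimal sketch of what "non-projective" means for a parse stored as a list of head indices. It is an illustration only, not the code changed in this commit. The helper names `is_nonprojective` and `sentence_starts` are hypothetical, and the sketch assumes that a root token points to itself and that the heads contain no cycles.

```python
from typing import List


def is_nonprojective(heads: List[int]) -> bool:
    """Return True if the parse described by `heads` is non-projective.

    heads[i] is the index of token i's head; a root points to itself.
    An arc is projective when every token strictly between the head and
    the child is a descendant of the head. Assumes there are no cycles.
    """

    def is_ancestor(ancestor: int, token: int) -> bool:
        # Walk up from `token` until we reach `ancestor` or a root.
        while True:
            if token == ancestor:
                return True
            if heads[token] == token:
                return False
            token = heads[token]

    for child, head in enumerate(heads):
        if child == head:
            continue  # root token, no incoming arc
        lo, hi = sorted((child, head))
        for between in range(lo + 1, hi):
            if not is_ancestor(head, between):
                return True
    return False


def sentence_starts(heads: List[int]) -> List[int]:
    """Return the index of the first token attached to each root.

    This follows head pointers to the root instead of assuming that a
    subtree covers a contiguous, projective span.
    """

    def root_of(token: int) -> int:
        while heads[token] != token:
            token = heads[token]
        return token

    seen_roots = set()
    starts = []
    for i in range(len(heads)):
        root = root_of(i)
        if root not in seen_roots:
            seen_roots.add(root)
            starts.append(i)
    return starts
```

For example, `[2, 3, 2, 2]` is non-projective: the arc between tokens 1 and 3 spans token 2, which is the root and therefore not a descendant of token 3.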
Matthew Honnibal
48fd36bf05
Fix test for issue 2772
2018-09-19 14:47:27 +02:00
Matthew Honnibal
6cd920e088
Add xfail test for deprojectivization SBD bug
2018-09-19 14:00:31 +02:00
Matthew Honnibal
99a6011580
Avoid adding empty layer in model, to keep models backwards compatible
2018-09-14 22:51:58 +02:00
Matthew Honnibal
c046392317
Trigger on_data hooks in parser model
2018-09-14 20:51:21 +02:00
Matthew Honnibal
5afd98dff5
Add a stepping function, for changing batch sizes or learning rates
2018-09-14 18:37:16 +02:00
Matthew Honnibal
27c00f4f22
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-09-14 12:30:57 +02:00
Matthew Honnibal
f32b52e611
Fix bug that caused deprojectivisation to run multiple times
2018-09-14 12:12:54 +02:00
Matthew Honnibal
8f2a6367e9
Fix usage of PyTorch BiLSTM in ud_train
2018-09-13 22:54:59 +00:00
Matthew Honnibal
afeddfff26
Fix PyTorch BiLSTM
2018-09-13 22:54:34 +00:00
Matthew Honnibal
a26fe8e7bb
Small hack in Language.update to make torch work
2018-09-13 22:51:52 +00:00
Matthew Honnibal
445b81ce3f
Support bilstm_depth argument in ud-train
2018-09-13 19:30:22 +02:00
Matthew Honnibal
b43643a953
Support bilstm_depth option in parser
2018-09-13 19:29:49 +02:00
Matthew Honnibal
45032fe9e1
Support option of BiLSTM in Tok2Vec (requires pytorch)
2018-09-13 19:28:35 +02:00
Matthew Honnibal
3eb9f3e2b8
Fix defaults for ud-train
2018-09-13 18:05:48 +02:00
Matthew Honnibal
59cf533879
Improve ud-train script. Make config optional
2018-09-13 14:24:08 +02:00
Matthew Honnibal
3e3a309764
Fix tagger
2018-09-13 14:14:38 +02:00
Matthew Honnibal
da7650e84b
Fix maximum doc length in ud_train script
2018-09-13 14:10:25 +02:00
Matthew Honnibal
a95eea4c06
Fix multi-task objective for parser
2018-09-13 14:08:55 +02:00
Matthew Honnibal
21321cd6cf
Add tok2vec property to parser model
2018-09-13 14:08:43 +02:00
Matthew Honnibal
d6aa60139d
Fix tagger training on GPU
2018-09-13 14:05:37 +02:00
Matthew Honnibal
b2cb1fc67d
Merge matcher tests
2018-09-06 01:39:53 +02:00
Suraj Krishnan Rajan
356af7b0a1
Fix tests
2018-09-06 01:39:36 +02:00
Matthew Honnibal
4d2d7d5866
Fix new feature flags
2018-08-27 02:12:39 +02:00
Matthew Honnibal
598dbf1ce0
Fix character-based tokenization for Japanese
2018-08-27 01:51:38 +02:00
Matthew Honnibal
3763e20afc
Pass subword_features and conv_depth params
2018-08-27 01:51:15 +02:00
Matthew Honnibal
8051136d70
Support subword_features and conv_depth params in Tok2Vec
2018-08-27 01:50:48 +02:00
Matthew Honnibal
9c33d4d1df
Add more hyper-parameters to spacy ud-train
* subword_features: Controls whether subword features (specifically
prefix, suffix and word shape) are used in the word embeddings. True by
default. Should be set to False for languages like Chinese and Japanese.
* conv_depth: Depth of the convolutional layers. Defaults to 4.
2018-08-27 01:48:46 +02:00
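To make "prefix, suffix and word shape" concrete, the sketch below computes simplified versions of these three features for a single word. It is an assumption-laden illustration, not spaCy's implementation: the function names are hypothetical, the prefix length (1) and suffix length (3) are chosen for the example, and the shape mapping is deliberately simplified.

```python
from typing import Dict


def word_shape(word: str) -> str:
    """Map each character to a coarse class: X, x, d, or itself."""
    out = []
    for ch in word:
        if ch.isdigit():
            out.append("d")
        elif ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        else:
            # Punctuation and caseless scripts are kept unchanged.
            out.append(ch)
    return "".join(out)


def subword_features(word: str) -> Dict[str, str]:
    """Return illustrative prefix, suffix and shape features.

    The lengths used here (1 and 3) are assumptions for this sketch.
    """
    return {
        "prefix": word[:1],
        "suffix": word[-3:],
        "shape": word_shape(word),
    }
```

In this sketch, characters without case (such as Chinese or Japanese characters) pass through the shape function unchanged, which hints at why such features carry little extra information for those languages.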
Matthew Honnibal
51a9efbf3b
Add draft Binder class
2018-08-22 13:12:51 +02:00
Matthew Honnibal
f0e6be689a
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-08-16 17:18:19 +02:00
Matthew Honnibal
5ce459d2ee
Fix error in vocab
2018-08-16 17:18:09 +02:00
Ines Montani
aeb49eb625
Update version [ci skip]
2018-08-16 16:56:02 +02:00
Ines Montani
a0eacd3293
Merge branch 'master' into develop
2018-08-16 16:55:05 +02:00
Ines Montani
c0fa9903f4
Update model directory JS [ci skip]
Prevent the default release URL from being overwritten and add license type
2018-08-16 16:54:50 +02:00
Ines Montani
03f661fefb
Add Greek to models directory [ci skip]
2018-08-16 16:51:56 +02:00
Matthew Honnibal
00febda2e3
Improve alignment around quotes
2018-08-16 01:04:34 +02:00
Matthew Honnibal
66a3f2ba21
Lower-case text before alignment
2018-08-16 00:42:36 +02:00
Matthew Honnibal
595c893791
Expose noise_level option in train CLI
2018-08-16 00:41:44 +02:00
Matthew Honnibal
8365226bf3
Fix lookup of symbols in vocab.
2018-08-15 23:43:34 +02:00
Matthew Honnibal
b9f0588580
Set version to v2.1.0a1
2018-08-15 17:22:39 +02:00
Matthew Honnibal
e968016417
Note link between issues #2671 and #2675
2018-08-15 17:18:28 +02:00
Matthew Honnibal
63bdc734ba
Skip flakey test
2018-08-15 16:56:55 +02:00
Matthew Honnibal
ce512e1d47
Fix #2671: Incorrect match ID on some patterns
2018-08-15 16:19:08 +02:00
Matthew Honnibal
f12b9190f6
Xfail test for issue #2671
2018-08-15 15:55:31 +02:00
Matthew Honnibal
7cfa665ce6
Add failing test for issue 2671: Incorrect rule ID returned from matcher
2018-08-15 15:54:33 +02:00
Matthew Honnibal
1b2a5869ab
Set version to v2.1.0a2.dev0
2018-08-15 15:38:52 +02:00
Matthew Honnibal
5080760288
Add extra comment on 'add label' in parser
2018-08-15 15:37:24 +02:00
Matthew Honnibal
6e749d3c70
Skip flakey parser test
2018-08-15 15:37:04 +02:00
Ines Montani
fd9d175a53
Update live code [ci skip]
2018-08-15 15:28:48 +02:00
Matthew Honnibal
48ed1ca29d
Add branch option to push-tag script
2018-08-15 03:16:43 +02:00