Matthew Honnibal
9bf6e93b3e
Set pretrained_vectors in begin_training
2018-03-28 16:32:41 +02:00
Matthew Honnibal
95a9615221
Fix loading of multiple pre-trained vectors
...
This patch addresses #1660 , which was caused by keying all pre-trained
vectors with the same ID when telling Thinc how to refer to them. This
meant that if multiple models were loaded that had pre-trained vectors,
errors or incorrect behaviour resulted.
The vectors class now includes a .name attribute, which defaults to:
{nlp.meta['lang']_nlp.meta['name']}.vectors
The vectors name is set in the cfg of the pipeline components under the
key pretrained_vectors. This replaces the previous cfg key
pretrained_dims.
In order to make existing models compatible with this change, we check
for the pretrained_dims key when loading models in from_disk and
from_bytes, and add the cfg key pretrained_vectors if we find it.
2018-03-28 16:02:59 +02:00
ines
07b8c255a5
Updatee example with note to install requests
2018-03-28 12:46:27 +02:00
ines
366c98a94b
Remove requests dependency
2018-03-28 12:46:18 +02:00
ines
7fbc9e5874
Replace requests with urllib
2018-03-28 12:46:07 +02:00
ines
da1f200362
Add compat helpers for urllib
2018-03-28 12:45:53 +02:00
ines
ac88c72c9a
Fix ftfy workaround and remove old import
2018-03-28 12:14:28 +02:00
ines
ce6071ca89
Remove ftfy dependency and update docs
2018-03-28 12:09:42 +02:00
Matthew Honnibal
070b6c6495
Remove dependency on ftfy
2018-03-28 12:07:02 +02:00
ines
6d2c85f428
Drop six and related hacks as a dependency
2018-03-28 10:45:25 +02:00
ines
9e83513004
Add position of invalid token to error message
2018-03-27 23:56:59 +02:00
ines
11c4735ccf
Fix issue in Italian lemmatizer data ( resolves #2050 )
2018-03-27 23:55:22 +02:00
Matthew Honnibal
6a961928b2
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-03-27 21:01:48 +00:00
Matthew Honnibal
b7136cb094
Support zipped vector files in init-model
2018-03-27 21:01:18 +00:00
ines
693971dd8f
Improve error message if token text is empty string (see #2101 )
2018-03-27 22:25:40 +02:00
ines
0c829e6605
Fix whitespace
2018-03-27 22:20:59 +02:00
Matthew Honnibal
de9fd091ac
Fix #2014 : token.pos_ not writeable
2018-03-27 21:21:11 +02:00
Matthew Honnibal
18da89e04c
Handle non-callable gold_tuples in parser begin_training
2018-03-27 21:08:41 +02:00
Matthew Honnibal
1f7229f40f
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit c9ba3d3c2d
, reversing
changes made to 92c26a35d4
.
2018-03-27 19:23:02 +02:00
Matthew Honnibal
8b7a74570f
Revert "Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop""
...
This reverts commit f41e626844
.
2018-03-27 19:22:52 +02:00
Matthew Honnibal
f41e626844
Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
...
This reverts commit c9ba3d3c2d
, reversing
changes made to f57bfbccdc
.
2018-03-27 19:22:25 +02:00
Matthew Honnibal
c9ba3d3c2d
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2018-03-27 18:59:08 +02:00
Matthew Honnibal
92c26a35d4
Update get_cuda_stream
2018-03-27 16:42:00 +00:00
Ines Montani
e0ae390607
Update CONTRIBUTING.md
2018-03-27 13:47:00 +02:00
Matthew Honnibal
f57bfbccdc
Fix non-projective label filtering
2018-03-27 13:41:33 +02:00
Matthew Honnibal
d2118792e7
Merge changes from master
2018-03-27 13:38:41 +02:00
Matthew Honnibal
d4680e4d83
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-27 13:36:37 +02:00
Matthew Honnibal
63a267b34d
Fix #2073 : Token.set_extension not working
2018-03-27 13:36:20 +02:00
Ines Montani
284bbb1dd1
Merge pull request #2146 from justindujardin/tensorboard-standalone-example
...
Add example using TensorBoard standalone projector
2018-03-27 13:23:32 +02:00
Matthew Honnibal
25280b7013
Try to make sum_state_features faster
2018-03-27 10:08:38 +00:00
Matthew Honnibal
987e1533a4
Use 8 features in parser
2018-03-27 10:08:12 +00:00
Matthew Honnibal
8bbd26579c
Support GPU in UD training script
2018-03-27 09:53:35 +00:00
Matthew Honnibal
dd54511c4f
Pass data as a function in begin_training methods
2018-03-27 09:39:59 +00:00
Matthew Honnibal
d9ebd78e11
Change default sizes in parser
2018-03-26 17:22:18 +02:00
Matthew Honnibal
a3d0cb15d3
Fix ent_iob tags in doc.merge to avoid inconsistent sequences
2018-03-26 07:16:06 +02:00
Matthew Honnibal
7d4687162f
Update doc.ents test
2018-03-26 07:14:35 +02:00
Matthew Honnibal
514d89a3ae
Set missing label for non-specified entities when setting doc.ents
2018-03-26 07:14:16 +02:00
Matthew Honnibal
54d7a1c916
Improve error message when entity sequence is inconsistent
2018-03-26 07:13:34 +02:00
Justin DuJardin
4eeb178856
Add example using TensorBoard standalone projector
...
- the tensorboard standalone project expects a different set of files than the plugin to TensorFlow.
2018-03-25 21:50:13 -07:00
Matthew Honnibal
938436455a
Add test for ent_iob during span merge
2018-03-25 22:16:19 +02:00
Matthew Honnibal
8e08c378fe
Fix entity IOB and tag in span merging
2018-03-25 22:16:01 +02:00
Matthew Honnibal
5430c43298
Set about to spacy-nightly
2018-03-25 19:30:14 +02:00
Matthew Honnibal
c059fcb0ba
Update thinc requirement
2018-03-25 19:29:36 +02:00
Ines Montani
68226109f4
Merge pull request #2142 from jimregan/polish-more-tokens
...
more exceptions
2018-03-24 19:06:44 +01:00
Matthew Honnibal
d566e673bf
Set version to v2.0.10
2018-03-24 18:09:03 +01:00
Matthew Honnibal
0d3bf0d4eb
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-24 17:31:49 +01:00
dejanmarich
ccd1c04c63
Update stop_words.py
...
Added more words
2018-03-24 17:31:24 +01:00
ines
f1446b0257
Port over Turkish changes
2018-03-24 17:31:07 +01:00
DuyguA
cd604878a4
quick typo fix
2018-03-24 17:26:35 +01:00
Matthew Honnibal
406548b976
Support .gz and .tar.gz files in spacy init-model
2018-03-24 17:18:32 +01:00