Matthew Honnibal
95a9615221
Fix loading of multiple pre-trained vectors
...
This patch addresses #1660, which was caused by keying all pre-trained
vectors with the same ID when telling Thinc how to refer to them. This
meant that if multiple models were loaded that had pre-trained vectors,
errors or incorrect behaviour resulted.
The vectors class now includes a .name attribute, which defaults to:
{nlp.meta['lang']}_{nlp.meta['name']}.vectors
The vectors name is set in the cfg of the pipeline components under the
key pretrained_vectors. This replaces the previous cfg key
pretrained_dims.
In order to make existing models compatible with this change, we check
for the pretrained_dims key when loading models in from_disk and
from_bytes, and add the cfg key pretrained_vectors if we find it.
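A minimal sketch of the naming and compatibility logic described above, not the actual spaCy code. The helper names default_vectors_name and upgrade_cfg are hypothetical, and keeping the old pretrained_dims key in place after the upgrade is an assumption:

```python
def default_vectors_name(meta):
    """Build the default vectors name from a model's meta dict."""
    return "{}_{}.vectors".format(meta["lang"], meta["name"])


def upgrade_cfg(cfg, vectors_name):
    """Return a copy of cfg with pretrained_vectors added if only the
    old pretrained_dims key is present (hypothetical helper)."""
    cfg = dict(cfg)
    if cfg.get("pretrained_dims") and not cfg.get("pretrained_vectors"):
        # Old-style model: point the component at the named vectors table.
        cfg["pretrained_vectors"] = vectors_name
    return cfg


meta = {"lang": "en", "name": "core_web_lg"}
old_cfg = {"pretrained_dims": 300}
new_cfg = upgrade_cfg(old_cfg, default_vectors_name(meta))
print(new_cfg["pretrained_vectors"])  # en_core_web_lg.vectors
```

Because the name is derived from each model's own meta, two models loaded together get distinct vectors IDs, which is the collision this patch is meant to avoid.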
2018-03-28 16:02:59 +02:00
Matthew Honnibal
070b6c6495
Remove dependency on ftfy
2018-03-28 12:07:02 +02:00
ines
6d2c85f428
Drop six and related hacks as a dependency
2018-03-28 10:45:25 +02:00
ines
9e83513004
Add position of invalid token to error message
2018-03-27 23:56:59 +02:00
ines
11c4735ccf
Fix issue in Italian lemmatizer data (resolves #2050)
2018-03-27 23:55:22 +02:00
ines
693971dd8f
Improve error message if token text is empty string (see #2101)
2018-03-27 22:25:40 +02:00
ines
0c829e6605
Fix whitespace
2018-03-27 22:20:59 +02:00
Ines Montani
e0ae390607
Update CONTRIBUTING.md
2018-03-27 13:47:00 +02:00
Matthew Honnibal
d4680e4d83
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-27 13:36:37 +02:00
Matthew Honnibal
63a267b34d
Fix #2073: Token.set_extension not working
2018-03-27 13:36:20 +02:00
Ines Montani
284bbb1dd1
Merge pull request #2146 from justindujardin/tensorboard-standalone-example
...
Add example using TensorBoard standalone projector
2018-03-27 13:23:32 +02:00
Justin DuJardin
4eeb178856
Add example using TensorBoard standalone projector
...
- The TensorBoard standalone projector expects a different set of files than the TensorFlow plugin.
2018-03-25 21:50:13 -07:00
Ines Montani
68226109f4
Merge pull request #2142 from jimregan/polish-more-tokens
...
more exceptions
2018-03-24 19:06:44 +01:00
Matthew Honnibal
d566e673bf
Set version to v2.0.10
2018-03-24 18:09:03 +01:00
Matthew Honnibal
0d3bf0d4eb
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-24 17:31:49 +01:00
dejanmarich
ccd1c04c63
Update stop_words.py
...
Added more words
2018-03-24 17:31:24 +01:00
ines
f1446b0257
Port over Turkish changes
2018-03-24 17:31:07 +01:00
DuyguA
cd604878a4
quick typo fix
2018-03-24 17:26:35 +01:00
Matthew Honnibal
406548b976
Support .gz and .tar.gz files in spacy init-model
2018-03-24 17:18:32 +01:00
ines
6173c4aaa6
Port over contributor agreements
2018-03-24 17:17:37 +01:00
ines
4ec2809eb5
Port over TensorBoard example
2018-03-24 17:15:48 +01:00
ines
5ecc60cf3b
Add book to resources [ci skip]
2018-03-24 17:12:56 +01:00
ines
53680642af
Port over docs changes [ci skip]
2018-03-24 17:12:48 +01:00
Matthew Honnibal
74cc6bb06a
Merge branch 'master' into hotfix/v2.0.9
2018-03-24 17:08:13 +01:00
Matthew Honnibal
11fc69d6ef
Merge remote-tracking branch 'origin'
2018-03-24 17:07:50 +01:00
Matthew Honnibal
48f3606a8a
Merge branch 'master' into hotfix/v2.0.9
2018-03-24 17:06:50 +01:00
Matthew Honnibal
d4cad89407
Merge branch 'develop'
2018-03-24 17:05:18 +01:00
Jim O'Regan
efe037e8be
more exceptions
2018-03-24 00:05:27 +00:00
Ines Montani
a218579be7
Merge pull request #2141 from ottosulin/fin_examples
...
Finnish examples
2018-03-23 22:57:28 +01:00
Ines Montani
719037cf20
Update formatting and add missing commas
2018-03-23 22:18:20 +01:00
Ines Montani
2b68361501
Merge pull request #2140 from ottosulin/ottosulin_contributor [ci skip]
...
My contributor agreement
2018-03-23 22:14:34 +01:00
Otto Sulin
266efc2018
Added Finnish examples
2018-03-23 22:58:52 +02:00
Ines Montani
cd97a44894
Merge pull request #2137 from justindujardin/tensorboard-example
...
Add example for visualizing word vectors with TensorBoard Projector
2018-03-23 21:47:13 +01:00
Otto Sulin
82acb8f399
My contributor agreement
2018-03-23 22:46:58 +02:00
Otto Sulin
1940e54602
Added Finnish numbers
2018-03-23 22:33:08 +02:00
Otto Sulin
4ec3f19e2b
fixed stop words -> to-do lex_attrs.py
2018-03-23 22:18:17 +02:00
Justin DuJardin
c7ff8ee66c
Add contributor agreement
2018-03-23 13:11:56 -07:00
Justin DuJardin
eef9430f07
Add example for visualizing word vectors with TensorBoard Projector
...
Use:
```bash
python vectors_tensorboard.py en_core_web_lg ./output_folder spaCy_large
```
2018-03-23 12:49:01 -07:00
Matthew Honnibal
85717f570c
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-23 20:30:42 +01:00
Matthew Honnibal
8902754f0b
Fix vector loading for ud_train
2018-03-23 20:30:00 +01:00
Ines Montani
782ec6f4f2
Merge pull request #2131 from calumcalder/fix-displacy-docs-typo
...
Fix typo in documentation for displacy Visualizer
2018-03-23 13:03:00 +01:00
Xiaoquan Kong
a71b99d7ff
bugfix for global-variable-change-in-runtime related issue (#2135)
...
* Bugfix: settings changed in spacy/cli/ud_train.py polluted the whole package
* Add contributor agreement of howl-anderson
2018-03-23 11:36:38 +01:00
Calum Calder
d000b4323a
Add contributor agreement
2018-03-22 19:29:22 +00:00
Calum Calder
c6a0c1cc38
Fix typo in documentation for displacy Visualizer
...
The word_spacing variable affects the vertical spacing between the words and arcs, not the horizontal spacing.
2018-03-22 19:23:32 +00:00
Ines Montani
c94139e436
Merge pull request #2126 from iann0036/patch-1
...
Add contributor doc
2018-03-22 09:04:17 +01:00
Ines Montani
40c444eaae
Merge pull request #2127 from SebastinSanty/docs-patch
...
Docs patch
2018-03-22 09:03:50 +01:00
Sebastin Santy
793d29904f
Update _similarity.jade
2018-03-22 03:51:38 +05:30
Ian Mckay
c33d6ca360
Add contributor doc
2018-03-22 09:04:58 +11:00
Sebastin Santy
720d2231f6
Update doc.jade
2018-03-22 03:13:23 +05:30
Matthew Honnibal
044397e269
Support .gz and .tar.gz files in spacy init-model
2018-03-21 14:33:23 +01:00