Matthew Honnibal
79dc241caa
Set pretrained_vectors in parser cfg
2018-03-28 17:35:07 +02:00
Matthew Honnibal
17c3e7efa2
Add message noting vectors
2018-03-28 16:33:43 +02:00
Matthew Honnibal
9bf6e93b3e
Set pretrained_vectors in begin_training
2018-03-28 16:32:41 +02:00
Matthew Honnibal
95a9615221
Fix loading of multiple pre-trained vectors
...
This patch addresses #1660, which was caused by keying all pre-trained
vectors with the same ID when telling Thinc how to refer to them. This
meant that if multiple models were loaded that had pre-trained vectors,
errors or incorrect behaviour resulted.
The vectors class now includes a .name attribute, which defaults to:
{nlp.meta['lang']}_{nlp.meta['name']}.vectors
The vectors name is set in the cfg of the pipeline components under the
key pretrained_vectors. This replaces the previous cfg key
pretrained_dims.
In order to make existing models compatible with this change, we check
for the pretrained_dims key when loading models in from_disk and
from_bytes, and add the cfg key pretrained_vectors if we find it.
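A minimal sketch of the naming and compatibility logic described above, not the actual spaCy implementation. The helper names `default_vectors_name` and `upgrade_cfg` are hypothetical, and the cfg is assumed to be a plain dict:

```python
def default_vectors_name(meta):
    """Build the default vectors name from the pipeline meta.

    Hypothetical helper: follows the '<lang>_<name>.vectors' pattern
    described in the commit message.
    """
    return "{}_{}.vectors".format(meta["lang"], meta["name"])


def upgrade_cfg(cfg, meta):
    """Return a copy of cfg with 'pretrained_vectors' filled in.

    Assumption: a legacy cfg is recognised by the presence of the old
    'pretrained_dims' key and the absence of 'pretrained_vectors'.
    """
    cfg = dict(cfg)  # do not mutate the caller's cfg
    if "pretrained_dims" in cfg and "pretrained_vectors" not in cfg:
        cfg["pretrained_vectors"] = default_vectors_name(meta)
    return cfg


# Illustrative values only
meta = {"lang": "en", "name": "core_web_lg"}
legacy_cfg = {"pretrained_dims": 300}
print(upgrade_cfg(legacy_cfg, meta))
```

Because each loaded model gets its own vectors name, two models with pre-trained vectors no longer share the same key.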
2018-03-28 16:02:59 +02:00
ines
07b8c255a5
Update example with note to install requests
2018-03-28 12:46:27 +02:00
ines
366c98a94b
Remove requests dependency
2018-03-28 12:46:18 +02:00
ines
7fbc9e5874
Replace requests with urllib
2018-03-28 12:46:07 +02:00
ines
da1f200362
Add compat helpers for urllib
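A rough sketch of what a urllib compat helper of this kind can look like, assuming the goal is a single `urlopen` import that works on both Python 2 and Python 3. The actual helper names in the codebase are not shown here, so `fetch` is a hypothetical wrapper:

```python
try:
    # Python 3
    from urllib.request import urlopen
except ImportError:
    # Python 2
    from urllib2 import urlopen


def fetch(url):
    """Hypothetical wrapper: return the raw response body as bytes."""
    response = urlopen(url)
    try:
        return response.read()
    finally:
        response.close()
```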
2018-03-28 12:45:53 +02:00
ines
ac88c72c9a
Fix ftfy workaround and remove old import
2018-03-28 12:14:28 +02:00
ines
ce6071ca89
Remove ftfy dependency and update docs
2018-03-28 12:09:42 +02:00
Matthew Honnibal
070b6c6495
Remove dependency on ftfy
2018-03-28 12:07:02 +02:00
ines
6d2c85f428
Drop six and related hacks as a dependency
2018-03-28 10:45:25 +02:00
ines
9e83513004
Add position of invalid token to error message
2018-03-27 23:56:59 +02:00
ines
11c4735ccf
Fix issue in Italian lemmatizer data (resolves #2050)
2018-03-27 23:55:22 +02:00
ines
693971dd8f
Improve error message if token text is empty string (see #2101)
2018-03-27 22:25:40 +02:00
ines
0c829e6605
Fix whitespace
2018-03-27 22:20:59 +02:00
Ines Montani
e0ae390607
Update CONTRIBUTING.md
2018-03-27 13:47:00 +02:00
Matthew Honnibal
d4680e4d83
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-27 13:36:37 +02:00
Matthew Honnibal
63a267b34d
Fix #2073: Token.set_extension not working
2018-03-27 13:36:20 +02:00
Ines Montani
284bbb1dd1
Merge pull request #2146 from justindujardin/tensorboard-standalone-example
...
Add example using TensorBoard standalone projector
2018-03-27 13:23:32 +02:00
Justin DuJardin
4eeb178856
Add example using TensorBoard standalone projector
...
- The TensorBoard standalone projector expects a different set of files than the TensorFlow plugin.
2018-03-25 21:50:13 -07:00
Ines Montani
68226109f4
Merge pull request #2142 from jimregan/polish-more-tokens
...
more exceptions
2018-03-24 19:06:44 +01:00
Matthew Honnibal
d566e673bf
Set version to v2.0.10
2018-03-24 18:09:03 +01:00
Matthew Honnibal
0d3bf0d4eb
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-24 17:31:49 +01:00
dejanmarich
ccd1c04c63
Update stop_words.py
...
Added more words
2018-03-24 17:31:24 +01:00
ines
f1446b0257
Port over Turkish changes
2018-03-24 17:31:07 +01:00
DuyguA
cd604878a4
quick typo fix
2018-03-24 17:26:35 +01:00
Matthew Honnibal
406548b976
Support .gz and .tar.gz files in spacy init-model
2018-03-24 17:18:32 +01:00
ines
6173c4aaa6
Port over contributor agreements
2018-03-24 17:17:37 +01:00
ines
4ec2809eb5
Port over TensorBoard example
2018-03-24 17:15:48 +01:00
ines
5ecc60cf3b
Add book to resources [ci skip]
2018-03-24 17:12:56 +01:00
ines
53680642af
Port over docs changes [ci skip]
2018-03-24 17:12:48 +01:00
Matthew Honnibal
74cc6bb06a
Merge branch 'master' into hotfix/v2.0.9
2018-03-24 17:08:13 +01:00
Matthew Honnibal
11fc69d6ef
Merge remote-tracking branch 'origin'
2018-03-24 17:07:50 +01:00
Matthew Honnibal
48f3606a8a
Merge branch 'master' into hotfix/v2.0.9
2018-03-24 17:06:50 +01:00
Matthew Honnibal
d4cad89407
Merge branch 'develop'
2018-03-24 17:05:18 +01:00
Jim O'Regan
efe037e8be
more exceptions
2018-03-24 00:05:27 +00:00
Ines Montani
a218579be7
Merge pull request #2141 from ottosulin/fin_examples
...
Finnish examples
2018-03-23 22:57:28 +01:00
Ines Montani
719037cf20
Update formatting and add missing commas
2018-03-23 22:18:20 +01:00
Ines Montani
2b68361501
Merge pull request #2140 from ottosulin/ottosulin_contributor [ci skip]
...
My contributor agreement
2018-03-23 22:14:34 +01:00
Otto Sulin
266efc2018
Added Finnish examples
2018-03-23 22:58:52 +02:00
Ines Montani
cd97a44894
Merge pull request #2137 from justindujardin/tensorboard-example
...
Add example for visualizing word vectors with TensorBoard Projector
2018-03-23 21:47:13 +01:00
Otto Sulin
82acb8f399
My contributor agreement
2018-03-23 22:46:58 +02:00
Otto Sulin
1940e54602
Added Finnish numbers
2018-03-23 22:33:08 +02:00
Otto Sulin
4ec3f19e2b
fixed stop words -> to-do lex_attrs.py
2018-03-23 22:18:17 +02:00
Justin DuJardin
c7ff8ee66c
Add contributor agreement
2018-03-23 13:11:56 -07:00
Justin DuJardin
eef9430f07
Add example for visualizing word vectors with TensorBoard Projector
...
Use:
```bash
python vectors_tensorboard.py en_core_web_lg ./output_folder spaCy_large
```
2018-03-23 12:49:01 -07:00
Matthew Honnibal
85717f570c
Merge branch 'master' of https://github.com/explosion/spaCy
2018-03-23 20:30:42 +01:00
Matthew Honnibal
8902754f0b
Fix vector loading for ud_train
2018-03-23 20:30:00 +01:00
Ines Montani
782ec6f4f2
Merge pull request #2131 from calumcalder/fix-displacy-docs-typo
...
Fix typo in documentation for displacy Visualizer
2018-03-23 13:03:00 +01:00