Matthew Honnibal
a2d573c039
Merge branch 'feature/vectors' of https://github.com/explosion/spaCy into feature/vectors
2020-07-29 14:56:27 +02:00
Matthew Honnibal
2af741d7e3
Fix train arg
2020-07-29 14:56:01 +02:00
Matthew Honnibal
c27309f839
Merge branch 'develop' into feature/vectors
2020-07-29 14:54:10 +02:00
Ines Montani
62266fb828
Fix broken type annotation
2020-07-29 14:49:49 +02:00
Matthew Honnibal
142b58be92
Fix import
2020-07-29 14:45:09 +02:00
Matthew Honnibal
c99a653070
Adjust textcat model
2020-07-29 14:38:15 +02:00
Matthew Honnibal
9e1b11dd81
Update vectors in textcat
2020-07-29 14:35:36 +02:00
Matthew Honnibal
105cf29967
Fix DocBin
2020-07-29 14:23:13 +02:00
Ines Montani
ff0bc05da8
Fix docstrings [ci skip]
2020-07-29 14:09:37 +02:00
Ines Montani
6e2623d3f8
Fix docstring [ci skip]
2020-07-29 14:08:05 +02:00
Ines Montani
8d56260d92
Fix docstrings [ci skip]
2020-07-29 14:07:13 +02:00
Ines Montani
80b18124d2
Fix docstring [ci skip]
2020-07-29 14:03:35 +02:00
Matthew Honnibal
f0cf4a2dca
Update tests
2020-07-29 14:01:14 +02:00
Matthew Honnibal
07b47eaac8
Update tok2vec layer
2020-07-29 14:01:13 +02:00
Matthew Honnibal
5ae8628571
Fix CharacterEmbed layer
2020-07-29 14:01:13 +02:00
Matthew Honnibal
97d3651574
Fix stray link_vectors_to_models call
2020-07-29 14:01:13 +02:00
Matthew Honnibal
c7d1ece3eb
Update tests
2020-07-29 14:01:13 +02:00
Matthew Honnibal
00de30bcc2
Update CharacterEmbed function
2020-07-29 14:01:12 +02:00
Matthew Honnibal
6a6b09bd32
Update morphologizer model
2020-07-29 14:01:12 +02:00
Matthew Honnibal
20e9098e3f
Update tests
2020-07-29 14:01:12 +02:00
Matthew Honnibal
c35d6282fc
Add previous HashEmbedCNN tok2vec to make transition easier
2020-07-29 14:01:12 +02:00
Matthew Honnibal
1784c95827
Clean up link_vectors_to_models unused stuff
2020-07-29 14:01:11 +02:00
Matthew Honnibal
0c17ea4c85
Format
2020-07-29 14:00:13 +02:00
Matthew Honnibal
2aff3c4b5a
Load vectors in 'spacy train'
2020-07-29 14:00:13 +02:00
Matthew Honnibal
7852a68a75
Fix load_vectors_into_model function
2020-07-29 14:00:13 +02:00
Matthew Honnibal
7299419fe4
Don't load vectors in Language.from_config
2020-07-29 14:00:12 +02:00
Matthew Honnibal
30dd96c540
Load vectors in Language.from_config
2020-07-29 14:00:12 +02:00
Matthew Honnibal
df95e2af64
Add load_vectors_into_model util
2020-07-29 14:00:12 +02:00
Matthew Honnibal
475d7c1c7c
Fix StaticVectors class
2020-07-29 14:00:11 +02:00
Matthew Honnibal
44d350dc94
Use spaCy's StaticVectors
2020-07-29 14:00:11 +02:00
Matthew Honnibal
acc64e138a
Add import
2020-07-29 14:00:11 +02:00
Matthew Honnibal
9987ea9e4d
Fix Tok2Vec begin_training
2020-07-29 14:00:10 +02:00
Matthew Honnibal
099e9331c5
Fix tok2vec
2020-07-29 14:00:10 +02:00
Matthew Honnibal
fe0cdcd461
Fixes
2020-07-29 14:00:09 +02:00
Matthew Honnibal
123f8b832d
Refactor Tok2Vec model
2020-07-29 14:00:09 +02:00
Matthew Honnibal
c6b4f63c7c
Remove obsolete function
2020-07-29 14:00:09 +02:00
Matthew Honnibal
9cc7262224
Draft StaticVectors layer
2020-07-29 14:00:09 +02:00
Matthew Honnibal
cb9654e98c
WIP on new StaticVectors
2020-07-29 14:00:09 +02:00
Ines Montani
e257e66ab9
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-29 11:36:45 +02:00
Ines Montani
e0ffe36e79
Update docstrings, docs and types
2020-07-29 11:36:42 +02:00
Sofie Van Landeghem
40c995b1be
Option for returning only greedy matches (#5771)
* add "greedy" option for match pattern
* distinction between greedy FIRST or LONGEST
* check for proper values, throw custom warning otherwise
* unxfail one more test
* add comment in docstring
* add test that LONGEST also prefers first match if equal length
* use c arrays for more efficient processing
* rename 'greediness' to 'greedy'
2020-07-29 11:04:43 +02:00
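The FIRST/LONGEST distinction named in the "greedy" commit above can be illustrated with a minimal sketch. This is not spaCy's implementation (the commit mentions C arrays, which are not shown here). The function name `filter_greedy`, the `(key, start, end)` tuple layout, and the choice to raise `ValueError` for invalid values (the commit mentions a custom warning) are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical match record: (pattern key, start token, end token), end exclusive.
Match = Tuple[str, int, int]


def filter_greedy(matches: List[Match], greedy: str) -> List[Match]:
    """Keep a non-overlapping subset of matches.

    "FIRST" prefers the match that starts earliest; "LONGEST" prefers the
    longest match and falls back to the earliest start when lengths are equal.
    """
    if greedy not in ("FIRST", "LONGEST"):
        # Assumption: this sketch raises instead of emitting a warning.
        raise ValueError(f"greedy must be 'FIRST' or 'LONGEST', got {greedy!r}")

    if greedy == "FIRST":
        # Earliest start first; among equal starts, the longer match first.
        ordered = sorted(matches, key=lambda m: (m[1], -(m[2] - m[1])))
    else:
        # Longest first; among equal lengths, the earlier start first.
        ordered = sorted(matches, key=lambda m: (-(m[2] - m[1]), m[1]))

    taken = set()  # token indices already covered by a kept match
    kept: List[Match] = []
    for key, start, end in ordered:
        if any(i in taken for i in range(start, end)):
            continue  # overlaps a match that was preferred earlier
        taken.update(range(start, end))
        kept.append((key, start, end))
    # Return the kept matches in document order.
    return sorted(kept, key=lambda m: m[1])


candidates = [("A", 0, 2), ("A", 1, 4), ("A", 4, 5)]
first = filter_greedy(candidates, "FIRST")
longest = filter_greedy(candidates, "LONGEST")
tie = filter_greedy([("A", 0, 2), ("A", 1, 3)], "LONGEST")
```

With these candidates, FIRST keeps the match starting at token 0 and drops the overlapping one, while LONGEST keeps the three-token match instead. The `tie` case shows the equal-length rule from the commit's bullet list: the earlier match wins.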
Adriane Boyd
191a12d75f
Fix score_weights typo in train CLI (#5835)
2020-07-29 11:04:12 +02:00
Adriane Boyd
0cddb0dbe9
Move timing into Language.evaluate (#5836)
Move timing into `Language.evaluate` so that only the processing is
timed, not processing + scoring. `Language.evaluate` returns
`scores["speed"]` as words per second, which should be identical to how
the speed was added to the scores previously. Also add the speed to the
evaluate CLI output.
2020-07-29 11:02:31 +02:00
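The timing change described in the `Language.evaluate` commit above can be sketched roughly as follows. This is a standalone illustration, not spaCy's code: the `evaluate` function, its `process` and `score` callables, and the use of token lists in place of `Doc` objects are all hypothetical. The only detail taken from the commit is that the timer covers processing but not scoring, and that `scores["speed"]` is reported in words per second.

```python
import time
from typing import Callable, Dict, List


def evaluate(
    texts: List[str],
    process: Callable[[str], List[str]],
    score: Callable[[List[List[str]]], Dict[str, float]],
) -> Dict[str, float]:
    """Score processed texts and report processing speed in words per second."""
    # Time only the processing step.
    start = time.perf_counter()
    docs = [process(text) for text in texts]
    elapsed = time.perf_counter() - start

    # Scoring happens outside the timed region.
    scores = score(docs)

    n_words = sum(len(doc) for doc in docs)
    # Guard against a zero reading on very fast runs.
    scores["speed"] = n_words / elapsed if elapsed > 0 else float("inf")
    return scores


# Hypothetical stand-ins: whitespace tokenization and a trivial scorer.
scores = evaluate(
    ["a b c", "d e"],
    process=str.split,
    score=lambda docs: {"n_docs": float(len(docs))},
)
```

Keeping the scorer outside the timed region means an expensive metric does not make the pipeline look slower than it is.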
Ines Montani
7adffc5361
Remove unused schema
2020-07-28 23:12:47 +02:00
Ines Montani
e5d9eaf79c
Tidy up docstrings and arguments
2020-07-28 23:12:42 +02:00
Ines Montani
2c7a32cf12
Remove unused methods
2020-07-28 16:50:02 +02:00
Ines Montani
ba22111ff4
Move error to Errors
2020-07-28 16:24:14 +02:00
Ines Montani
2748249217
Re-add meta["pipeline"] for now
2020-07-28 16:14:23 +02:00
Ines Montani
b83ead5bf5
Merge pull request #5824 from svlandeg/fix/textcat-v3
2020-07-28 15:04:25 +02:00
Ines Montani
06a97a8766
Support --opt=value format in CLI config overrides
2020-07-28 13:43:15 +02:00