Matthew Honnibal
7299419fe4
Don't load vectors in Language.from_config
2020-07-29 14:00:12 +02:00
Matthew Honnibal
30dd96c540
Load vectors in Language.from_config
2020-07-29 14:00:12 +02:00
Matthew Honnibal
df95e2af64
Add load_vectors_into_model util
2020-07-29 14:00:12 +02:00
Matthew Honnibal
475d7c1c7c
Fix StaticVectors class
2020-07-29 14:00:11 +02:00
Matthew Honnibal
44d350dc94
Use spaCy's StaticVectors
2020-07-29 14:00:11 +02:00
Matthew Honnibal
984754e3be
Update config
2020-07-29 14:00:11 +02:00
Matthew Honnibal
acc64e138a
Add import
2020-07-29 14:00:11 +02:00
Matthew Honnibal
9987ea9e4d
Fix Tok2Vec begin_training
2020-07-29 14:00:10 +02:00
Matthew Honnibal
099e9331c5
Fix tok2vec
2020-07-29 14:00:10 +02:00
Matthew Honnibal
fe0cdcd461
Fixes
2020-07-29 14:00:09 +02:00
Matthew Honnibal
034d803b7a
Update ptb config
2020-07-29 14:00:09 +02:00
Matthew Honnibal
123f8b832d
Refactor Tok2Vec model
2020-07-29 14:00:09 +02:00
Matthew Honnibal
c6b4f63c7c
Remove obsolete function
2020-07-29 14:00:09 +02:00
Matthew Honnibal
9cc7262224
Draft StaticVectors layer
2020-07-29 14:00:09 +02:00
Matthew Honnibal
cb9654e98c
WIP on new StaticVectors
2020-07-29 14:00:09 +02:00
Ines Montani
e257e66ab9
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-29 11:36:45 +02:00
Ines Montani
e0ffe36e79
Update docstrings, docs and types
2020-07-29 11:36:42 +02:00
Sofie Van Landeghem
40c995b1be
Option for returning only greedy matches (#5771)
* add "greedy" option for match pattern
* distinction between greedy FIRST or LONGEST
* check for proper values, throw custom warning otherwise
* unxfail one more test
* add comment in docstring
* add test that LONGEST also prefers first match if equal length
* use c arrays for more efficient processing
* rename 'greediness' to 'greedy'
2020-07-29 11:04:43 +02:00
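The FIRST/LONGEST distinction in the commit above (#5771) can be illustrated with a small, library-independent sketch. It is not the matcher's actual implementation (the commit mentions C arrays). `filter_greedy` and the `(start, end)` span tuples are hypothetical names, the tie-break for equal start offsets under FIRST is an assumption, and invalid values raise an exception here instead of the custom warning the commit describes.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def filter_greedy(matches: List[Span], greedy: str) -> List[Span]:
    """Keep a non-overlapping subset of candidate matches (hypothetical helper)."""
    if greedy not in ("FIRST", "LONGEST"):
        raise ValueError(f"greedy must be 'FIRST' or 'LONGEST', got {greedy!r}")
    if greedy == "FIRST":
        # Earliest start wins; for equal starts the longer span wins (assumption).
        ordered = sorted(matches, key=lambda m: (m[0], -(m[1] - m[0])))
    else:
        # Longest span wins; for equal lengths the earlier start wins.
        ordered = sorted(matches, key=lambda m: (-(m[1] - m[0]), m[0]))
    taken = set()  # token positions already covered by a kept match
    kept = []
    for start, end in ordered:
        if any(i in taken for i in range(start, end)):
            continue  # overlaps a match that had higher priority
        taken.update(range(start, end))
        kept.append((start, end))
    return sorted(kept)
```

With candidates `[(0, 2), (1, 4), (4, 5)]`, FIRST keeps `(0, 2)` and `(4, 5)`, while LONGEST keeps `(1, 4)` and `(4, 5)`.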
Adriane Boyd
191a12d75f
Fix score_weights typo in train CLI (#5835)
2020-07-29 11:04:12 +02:00
Adriane Boyd
0cddb0dbe9
Move timing into Language.evaluate (#5836)
Move timing into `Language.evaluate` so that only the processing is
timed, not processing plus scoring. `Language.evaluate` returns
`scores["speed"]` as words per second, which should be identical to how
the speed was added to the scores previously. Also add the speed to the
evaluate CLI output.
2020-07-29 11:02:31 +02:00
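The idea in the commit above (#5836), timing only the processing step and reporting words per second, might look roughly like the following. This is a standalone sketch, not the code in `Language.evaluate`: `evaluate_with_speed`, `process` and `score` are hypothetical stand-ins, and a "doc" is simply a list of word strings.

```python
import time
from typing import Callable, Dict, Iterable, List


def evaluate_with_speed(
    texts: Iterable[str],
    process: Callable[[str], List[str]],
    score: Callable[[List[List[str]]], Dict[str, float]],
) -> Dict[str, float]:
    """Time only the processing step, then score (hypothetical helper)."""
    start = time.perf_counter()
    docs = [process(text) for text in texts]
    elapsed = time.perf_counter() - start
    # Scoring runs after the clock has stopped, so it does not affect speed.
    scores = score(docs)
    n_words = sum(len(doc) for doc in docs)
    scores["speed"] = n_words / elapsed if elapsed > 0 else 0.0
    return scores


# Usage: whitespace tokenization stands in for the pipeline.
result = evaluate_with_speed(
    ["a b c", "d e"], str.split, lambda docs: {"n_docs": float(len(docs))}
)
```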
Adriane Boyd
7a6ac47dc1
Remove keyword-only from Scorer API docs
2020-07-29 10:40:30 +02:00
Adriane Boyd
c689ae8f0a
Fix types in Scorer
2020-07-29 10:40:30 +02:00
oculusrepairo
03ab518f28
Update examples.py (#5820)
* Update examples.py
adding factual sentences to the list
* Add missing comma separators
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-07-29 10:28:56 +02:00
Ines Montani
7adffc5361
Remove unused schema
2020-07-28 23:12:47 +02:00
Ines Montani
e5d9eaf79c
Tidy up docstrings and arguments
2020-07-28 23:12:42 +02:00
Ines Montani
ac24adec73
Small adjustments to Scorer and docs
2020-07-28 21:39:42 +02:00
Ines Montani
256b24b720
Update arch docs WIP [ci skip]
2020-07-28 20:33:52 +02:00
Ines Montani
2c7a32cf12
Remove unused methods
2020-07-28 16:50:02 +02:00
Ines Montani
ba22111ff4
Move error to Errors
2020-07-28 16:24:14 +02:00
Ines Montani
2748249217
Re-add meta["pipeline"] for now
2020-07-28 16:14:23 +02:00
Ines Montani
b83ead5bf5
Merge pull request #5824 from svlandeg/fix/textcat-v3
2020-07-28 15:04:25 +02:00
Ines Montani
06a97a8766
Support --opt=value format in CLI config overrides
2020-07-28 13:43:15 +02:00
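Accepting both `--opt value` and `--opt=value` for config overrides can be sketched as below. This is an illustration only, not the CLI's actual parser: `parse_overrides` is a hypothetical name, values are left as strings, and the option names in the usage line are made up for the example.

```python
from typing import Dict, List


def parse_overrides(args: List[str]) -> Dict[str, str]:
    """Parse '--opt value' and '--opt=value' pairs (hypothetical helper)."""
    overrides: Dict[str, str] = {}
    i = 0
    while i < len(args):
        arg = args[i]
        if not arg.startswith("--"):
            raise ValueError(f"Expected an option starting with '--', got {arg!r}")
        name = arg[2:]
        if "=" in name:
            # --opt=value: split on the first "=" only, so values may contain "=".
            name, value = name.split("=", 1)
            i += 1
        else:
            # --opt value: the value is the next argument.
            if i + 1 >= len(args):
                raise ValueError(f"Option {arg!r} is missing a value")
            value = args[i + 1]
            i += 2
        overrides[name] = value
    return overrides


# Usage: both spellings end up in the same dictionary.
overrides = parse_overrides(["--training.batch_size=128", "--nlp.lang", "en"])
```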
Ines Montani
ae4d8a6ffd
Update docstrings, docs and pipe consistency
2020-07-28 13:37:31 +02:00
Ines Montani
0094cb0d04
Remove scores list from config and document
2020-07-28 11:22:24 +02:00
Ines Montani
9b704c3db3
Merge pull request #5819 from explosion/feature/component-scores
2020-07-28 10:40:56 +02:00
graue70
b97dbab998
Fix typo in unit tests (#5823)
2020-07-27 20:18:48 +02:00
Ines Montani
2f83848b1f
Fix title [ci skip]
2020-07-27 18:25:38 +02:00
Ines Montani
894e20c466
Merge branch 'develop' into feature/component-scores
2020-07-27 18:14:39 +02:00
Ines Montani
d8b519c23c
API docs, docstrings and argument consistency
2020-07-27 18:11:45 +02:00
svlandeg
85b2dcfd67
cleanup
2020-07-27 17:54:44 +02:00
svlandeg
8353ca5a51
remove printing of config
2020-07-27 17:53:36 +02:00
svlandeg
61068e0fb1
util function dot_to_object and corresponding unit test
2020-07-27 17:50:12 +02:00
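The commit above only names the `dot_to_object` utility. A minimal sketch of what resolving a dotted path into a nested config could look like follows, assuming the config is a plain nested dictionary; the signature and error handling are assumptions, not the library's code.

```python
from typing import Any, Dict


def dot_to_object(config: Dict[str, Any], section: str) -> Any:
    """Follow a dotted path such as 'training.optimizer' into nested dicts."""
    node: Any = config
    for part in section.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"Config has no section {section!r}")
        node = node[part]
    return node


# Usage with an illustrative config.
config = {"training": {"optimizer": {"learn_rate": 0.001}}}
learn_rate = dot_to_object(config, "training.optimizer.learn_rate")
```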
Ines Montani
10b84e1e27
Add flag to toggle sdist creation on package [ci skip]
2020-07-27 16:52:23 +02:00
svlandeg
674c39bff9
fix train_textcat script
2020-07-27 16:48:21 +02:00
Adriane Boyd
fdf09cb231
Update Scorer API docs for score_cats
2020-07-27 15:34:42 +02:00
Adriane Boyd
34c92dfe63
Add missing Scorer imports
2020-07-27 15:08:51 +02:00
Adriane Boyd
8bb0507777
Add and update score methods and score weights
Add and update `score` methods, provided `scores`, and default weights
`default_score_weights` for pipeline components.
* `scores` provides all top-level keys returned by `score` (merely informative, similar to `assigns`).
* `default_score_weights` provides the default weights for a default config.
* The keys from `default_score_weights` determine which values will be
shown in the `spacy train` output, so keys with weight `0.0` will be
displayed but not counted toward the overall score.
2020-07-27 14:44:53 +02:00
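The role of the weights described in the commit above can be sketched as follows: every key in the weights is displayed, but a key with weight `0.0` contributes nothing to the overall score. `weighted_score` and `report_row` are hypothetical helpers, and the score keys in the usage are only illustrative.

```python
from typing import Dict, Optional


def weighted_score(
    scores: Dict[str, Optional[float]], weights: Dict[str, float]
) -> float:
    """Overall score as a weighted sum; missing or None scores count as 0."""
    return sum((scores.get(key) or 0.0) * weight for key, weight in weights.items())


def report_row(
    scores: Dict[str, Optional[float]], weights: Dict[str, float]
) -> Dict[str, Optional[float]]:
    """Values to display: every key in the weights, including weight 0.0."""
    return {key: scores.get(key) for key in weights}


# Usage: "speed" is shown in the row but does not change the overall score.
scores = {"tag_acc": 0.9, "dep_uas": 0.8, "speed": 1000.0}
weights = {"tag_acc": 0.5, "dep_uas": 0.5, "speed": 0.0}
overall = weighted_score(scores, weights)
row = report_row(scores, weights)
```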
Adriane Boyd
baf19fd652
Update cats scoring to provide overall score
* Provide top-level score as `attr_score`
* Provide a description of the score as `attr_score_desc`
* Provide all potential score keys, setting unused keys to `None`
* Update CLI evaluate accordingly
2020-07-27 12:26:10 +02:00
Adriane Boyd
f8cf378be9
Combine weights from multiple components
Combine weights from multiple components for the same score.
2020-07-27 10:21:31 +02:00
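The commit above does not say how weights for the same score are combined. One plausible approach, shown here purely as an assumption, is to sum the weights per key across components and then normalize so that they add up to 1. `combine_weights` is a hypothetical name.

```python
from typing import Dict, List


def combine_weights(per_component: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum weights per score key across components, then normalize (assumed rule)."""
    combined: Dict[str, float] = {}
    for weights in per_component:
        for key, value in weights.items():
            combined[key] = combined.get(key, 0.0) + value
    total = sum(combined.values())
    if total == 0:
        return combined  # nothing to normalize
    return {key: value / total for key, value in combined.items()}


# Usage: two components both contribute a weight for "tag_acc".
merged = combine_weights([{"tag_acc": 1.0}, {"tag_acc": 0.5, "pos_acc": 0.5}])
```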
Adriane Boyd
2880d8a555
Normalize spelling for spaCy (#5822)
2020-07-27 10:09:33 +02:00