Ines Montani
e257e66ab9
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-29 11:36:45 +02:00
Ines Montani
e0ffe36e79
Update docstrings, docs and types
2020-07-29 11:36:42 +02:00
Sofie Van Landeghem
40c995b1be
Option for returning only greedy matches (#5771)
* add "greedy" option for match pattern
* distinction between greedy FIRST or LONGEST
* check for proper values, throw custom warning otherwise
* unxfail one more test
* add comment in docstring
* add test that LONGEST also prefers first match if equal length
* use c arrays for more efficient processing
* rename 'greediness' to 'greedy'
2020-07-29 11:04:43 +02:00
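
The difference between the two greedy modes named in commit 40c995b1be can be illustrated with a small standalone sketch. This is not the matcher's actual implementation (the commit mentions C arrays); it is a plain-Python approximation over `(start, end)` token offsets. The function name `filter_greedy`, the tie-breaking rule for `FIRST`, and raising `ValueError` on a bad value are assumptions made for illustration.

```python
from typing import List, Tuple

Match = Tuple[int, int]  # (start, end) token offsets, end exclusive


def filter_greedy(matches: List[Match], greedy: str) -> List[Match]:
    """Keep a non-overlapping subset of the matches for one pattern.

    Hypothetical helper: "FIRST" prefers the earliest match, "LONGEST"
    prefers the longest match and falls back to the earliest on ties.
    """
    if greedy not in ("FIRST", "LONGEST"):
        # Assumption: the real code emits a custom warning/error instead.
        raise ValueError(f"greedy must be 'FIRST' or 'LONGEST', got {greedy!r}")
    if greedy == "FIRST":
        # Earliest start wins; among equal starts, the longer match wins.
        ordered = sorted(matches, key=lambda m: (m[0], -(m[1] - m[0])))
    else:
        # Longest wins; among equal lengths, the earlier match wins.
        ordered = sorted(matches, key=lambda m: (-(m[1] - m[0]), m[0]))
    taken = set()  # token positions already covered by a kept match
    kept = []
    for start, end in ordered:
        if any(i in taken for i in range(start, end)):
            continue  # overlaps a match that had higher priority
        kept.append((start, end))
        taken.update(range(start, end))
    return sorted(kept)


candidates = [(0, 1), (0, 2), (1, 4), (5, 6)]
first = filter_greedy(candidates, "FIRST")
longest = filter_greedy(candidates, "LONGEST")
```

With these candidates, `FIRST` keeps the match starting at token 0 and drops `(1, 4)`, while `LONGEST` keeps `(1, 4)` and only the shorter matches that do not overlap it.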
Adriane Boyd
191a12d75f
Fix score_weights typo in train CLI (#5835)
2020-07-29 11:04:12 +02:00
Adriane Boyd
0cddb0dbe9
Move timing into Language.evaluate (#5836)
Move timing into `Language.evaluate` so that only the processing is
timed, not processing + scoring. `Language.evaluate` returns
`scores["speed"]` as words per second, which should be identical to how
the speed was added to the scores previously. Also add the speed to the
evaluate CLI output.
2020-07-29 11:02:31 +02:00
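
The idea in commit 0cddb0dbe9, timing only the processing step and reporting words per second, can be sketched as follows. This is a simplified stand-in, not the body of `Language.evaluate`: the `evaluate` function below, its `process` and `score` callables, and the use of `len(doc)` as the word count are all assumptions for illustration.

```python
from timeit import default_timer as timer
from typing import Callable, Dict, Iterable, List, Sequence


def evaluate(
    texts: Iterable[str],
    process: Callable[[str], Sequence[str]],
    score: Callable[[List[Sequence[str]]], Dict[str, float]],
) -> Dict[str, float]:
    """Hypothetical evaluate loop that times processing but not scoring."""
    start = timer()
    docs = [process(text) for text in texts]  # timed: processing only
    elapsed = timer() - start
    scores = dict(score(docs))  # not timed: scoring happens afterwards
    n_words = sum(len(doc) for doc in docs)
    # Words per second; guard against a zero reading on very fast runs.
    scores["speed"] = n_words / elapsed if elapsed > 0 else float("inf")
    return scores


results = evaluate(
    ["a short text", "another slightly longer text"],
    process=str.split,  # stand-in for running the pipeline
    score=lambda docs: {"token_acc": 1.0},  # stand-in for the scorer
)
```

Because the timer stops before `score` is called, a slow scorer does not lower the reported `speed`.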
Ines Montani
7adffc5361
Remove unused schema
2020-07-28 23:12:47 +02:00
Ines Montani
e5d9eaf79c
Tidy up docstrings and arguments
2020-07-28 23:12:42 +02:00
Ines Montani
256b24b720
Update arch docs WIP [ci skip]
2020-07-28 20:33:52 +02:00
Ines Montani
2c7a32cf12
Remove unused methods
2020-07-28 16:50:02 +02:00
Ines Montani
ba22111ff4
Move error to Errors
2020-07-28 16:24:14 +02:00
Ines Montani
2748249217
Re-add meta["pipeline"] for now
2020-07-28 16:14:23 +02:00
Ines Montani
b83ead5bf5
Merge pull request #5824 from svlandeg/fix/textcat-v3
2020-07-28 15:04:25 +02:00
Ines Montani
06a97a8766
Support --opt=value format in CLI config overrides
2020-07-28 13:43:15 +02:00
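
Commit 06a97a8766 adds support for the `--opt=value` form of config overrides next to the space-separated form. A minimal sketch of accepting both forms is shown below. The function name `parse_overrides`, the example option names, and the choice to raise an error when a value is missing are assumptions, not the CLI's actual code.

```python
from typing import Dict, List


def parse_overrides(args: List[str]) -> Dict[str, str]:
    """Hypothetical parser accepting `--opt value` and `--opt=value`."""
    overrides: Dict[str, str] = {}
    i = 0
    while i < len(args):
        arg = args[i]
        if not arg.startswith("--"):
            raise ValueError(f"expected an option starting with '--', got {arg!r}")
        opt = arg[2:]
        if "=" in opt:
            # --opt=value: split on the first '=' only, so values may contain '='.
            key, value = opt.split("=", 1)
            i += 1
        elif i + 1 < len(args) and not args[i + 1].startswith("--"):
            # --opt value: the next argument is the value.
            key, value = opt, args[i + 1]
            i += 2
        else:
            # Assumption: a bare option without a value is treated as an error.
            raise ValueError(f"missing value for {arg!r}")
        overrides[key] = value
    return overrides


parsed = parse_overrides(["--training.batch_size=128", "--nlp.lang", "en"])
```

Both spellings end up in the same dictionary keyed by the dotted option name, so later code does not need to know which form was used.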
Ines Montani
ae4d8a6ffd
Update docstrings, docs and pipe consistency
2020-07-28 13:37:31 +02:00
Ines Montani
0094cb0d04
Remove scores list from config and document
2020-07-28 11:22:24 +02:00
Ines Montani
9b704c3db3
Merge pull request #5819 from explosion/feature/component-scores
2020-07-28 10:40:56 +02:00
Ines Montani
2f83848b1f
Fix title [ci skip]
2020-07-27 18:25:38 +02:00
Ines Montani
894e20c466
Merge branch 'develop' into feature/component-scores
2020-07-27 18:14:39 +02:00
Ines Montani
d8b519c23c
API docs, docstrings and argument consistency
2020-07-27 18:11:45 +02:00
svlandeg
85b2dcfd67
cleanup
2020-07-27 17:54:44 +02:00
svlandeg
8353ca5a51
remove printing of config
2020-07-27 17:53:36 +02:00
svlandeg
61068e0fb1
util function dot_to_object and corresponding unit test
2020-07-27 17:50:12 +02:00
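
Commit 61068e0fb1 introduces a `dot_to_object` utility. Judging by the name alone, it resolves a dotted path against a nested config; the sketch below shows one way such a lookup could work. The signature, the error type, and the example config keys are assumptions rather than the utility's real definition.

```python
from typing import Any, Dict


def dot_to_object(config: Dict[str, Any], section: str) -> Any:
    """Hypothetical lookup of a dotted path such as "training.optimizer"."""
    node: Any = config
    for part in section.split("."):
        # Walk one level deeper per path component.
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"cannot resolve {section!r}: no section {part!r}")
        node = node[part]
    return node


example_config = {"training": {"optimizer": {"learn_rate": 0.001}}}
learn_rate = dot_to_object(example_config, "training.optimizer.learn_rate")
optimizer = dot_to_object(example_config, "training.optimizer")
```

The lookup returns whatever sits at the end of the path, so it works for both leaf values and whole sub-sections.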
Ines Montani
10b84e1e27
Add flag to toggle sdist creation on package [ci skip]
2020-07-27 16:52:23 +02:00
svlandeg
674c39bff9
fix train_textcat script
2020-07-27 16:48:21 +02:00
Adriane Boyd
fdf09cb231
Update Scorer API docs for score_cats
2020-07-27 15:34:42 +02:00
Adriane Boyd
34c92dfe63
Add missing Scorer imports
2020-07-27 15:08:51 +02:00
Adriane Boyd
8bb0507777
Add and update score methods and score weights
Add and update `score` methods, provided `scores`, and default weights
`default_score_weights` for pipeline components.
* `scores` provides all top-level keys returned by `score` (merely informative, similar to `assigns`).
* `default_score_weights` provides the default weights for a default config.
* The keys from `default_score_weights` determine which values will be
shown in the `spacy train` output, so keys with weight `0.0` will be
displayed but not counted toward the overall score.
2020-07-27 14:44:53 +02:00
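
The behaviour described in commit 8bb0507777, where keys with weight `0.0` are displayed but do not count toward the overall score, can be sketched as a weighted sum. The helper names `weighted_score` and `displayed_columns`, the example score keys, and the treatment of missing or `None` scores as `0.0` are assumptions for illustration.

```python
from typing import Dict, List, Optional


def weighted_score(
    scores: Dict[str, Optional[float]], weights: Dict[str, float]
) -> float:
    """Hypothetical overall score: sum of each score times its weight."""
    # Assumption: a missing or None score contributes 0.0.
    return sum((scores.get(key) or 0.0) * weight for key, weight in weights.items())


def displayed_columns(weights: Dict[str, float]) -> List[str]:
    """Every key in the weights is shown, including those weighted 0.0."""
    return list(weights)


example_scores = {"tag_acc": 0.9, "dep_uas": 0.8, "token_acc": 1.0}
example_weights = {"tag_acc": 0.5, "dep_uas": 0.5, "token_acc": 0.0}

overall = weighted_score(example_scores, example_weights)
columns = displayed_columns(example_weights)
```

Here `token_acc` appears among the displayed columns, but its weight of `0.0` means it leaves the overall score unchanged.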
Adriane Boyd
baf19fd652
Update cats scoring to provide overall score
* Provide top-level score as `attr_score`
* Provide a description of the score as `attr_score_desc`
* Provide all potential scores keys, setting unused keys to `None`
* Update CLI evaluate accordingly
2020-07-27 12:26:10 +02:00
Adriane Boyd
f8cf378be9
Combine weights from multiple components
Combine weights from multiple components for the same score.
2020-07-27 10:21:31 +02:00
Ines Montani
7dd53d0964
Fix typo [ci skip]
2020-07-27 00:34:00 +02:00
Ines Montani
7adbaf9a5b
Update docs [ci skip]
2020-07-27 00:29:45 +02:00
Ines Montani
3d56a3f286
Make more args keyword-only
2020-07-27 00:27:53 +02:00
Matthew Honnibal
80271ac0ba
Update default config
2020-07-26 15:27:39 +02:00
Ines Montani
ed61fb10fc
Rename default textcat arch to TextCatEnsemble
2020-07-26 15:11:43 +02:00
Ines Montani
53d37da29a
Make sure @factories is removed from config
2020-07-26 15:11:24 +02:00
Matthew Honnibal
ac5901d076
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-07-26 14:20:27 +02:00
Matthew Honnibal
fb5dbe30b5
Trim training 101
2020-07-26 13:43:22 +02:00
Matthew Honnibal
e6a7deb7cc
Edits to the training 101 section
2020-07-26 13:42:08 +02:00
Ines Montani
4060c2d5a6
Fix test
2020-07-26 13:40:19 +02:00
Ines Montani
2470486543
Allow pipeline components to set default scores and weights
2020-07-26 13:18:43 +02:00
Ines Montani
787d066e22
Remove pipes.pyx
Probably accidentally re-added in a merge?
2020-07-26 13:08:52 +02:00
Matthew Honnibal
520d25cb50
Add smart_open dependency to fetch project assets (#5812)
* Use smart_open for project assets
* Fix assets.py
* Update pyproject.toml
2020-07-26 12:15:00 +02:00
Ines Montani
c288dba8e7
Update docs [ci skip]
2020-07-25 18:51:12 +02:00
Ines Montani
1346ee06d4
Merge pull request #5813 from explosion/chore/tidy-autoformat-types
Tidy up, autoformat, add types
2020-07-25 18:44:08 +02:00
Ines Montani
eb9acae34d
Merge pull request #5791 from adrianeboyd/docs/morphology
2020-07-25 15:10:21 +02:00
Ines Montani
e92df281ce
Tidy up, autoformat, add types
2020-07-25 15:01:15 +02:00
Matthew Honnibal
71242327b2
Set version to v3.0.0a5
2020-07-25 14:06:01 +02:00
Matthew Honnibal
afd504f8c0
Update config
2020-07-25 14:04:25 +02:00
Ines Montani
cdbd6ba912
Merge pull request #5798 from explosion/feature/language-data-config
2020-07-25 13:34:49 +02:00
Matthew Honnibal
44a0b072e0
Merge branch 'feature/language-data-config' of https://github.com/explosion/spaCy into feature/language-data-config
2020-07-25 13:34:07 +02:00