Adriane Boyd
a4b32b9552
Handle missing reference values in scorer ( #6286 )
* Handle missing reference values in scorer
Handle missing values in the reference doc during scoring where it is
possible to detect an unset state for the attribute. If no reference
docs contain annotation, `None` is returned instead of a score.
`spacy evaluate` displays `-` for missing scores and the missing scores
are saved as `None`/`null` in the metrics.
Attributes without unset states:
* `token.head`: relies on `token.dep` to recognize unset values
* `doc.cats`: unable to handle missing annotation
Additional changes:
* add optional `has_annotation` check to `score_spans` to replace
`doc.sents` hack
* update `score_token_attr_per_feat` to handle missing and empty morph
representations
* fix bug in `Doc.has_annotation` for normalization of `IS_SENT_START`
vs. `SENT_START`
* Fix import
* Update return types
2020-11-03 15:47:18 +01:00
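The behaviour described in the commit above can be illustrated with a minimal sketch. This is not spaCy's scorer: `score_attr` and `format_score` are hypothetical helpers, and the nested lists of strings stand in for token attributes, with `None` assumed to mark an unset reference value.

```python
import json
from typing import List, Optional


def score_attr(
    gold_docs: List[List[Optional[str]]], pred_docs: List[List[str]]
) -> Optional[float]:
    """Accuracy over tokens whose reference value is set.

    Returns None (not 0.0) when no reference token carries annotation.
    Hypothetical helper, not part of spaCy.
    """
    correct = 0
    total = 0
    for gold_doc, pred_doc in zip(gold_docs, pred_docs):
        for gold, pred in zip(gold_doc, pred_doc):
            if gold is None:
                # Unset reference value: skip instead of counting an error.
                continue
            total += 1
            correct += int(gold == pred)
    if total == 0:
        return None
    return correct / total


def format_score(score: Optional[float]) -> str:
    """Render a score for display, using '-' for a missing score."""
    return "-" if score is None else f"{score * 100:.2f}"


scores = {
    # One of the two annotated tokens matches; the unset token is ignored.
    "tag_acc": score_attr([["DT", None, "NN"]], [["DT", "JJ", "VB"]]),
    # No reference annotation at all, so there is no score to report.
    "pos_acc": score_attr([[None, None, None]], [["DET", "ADJ", "NOUN"]]),
}

for name, value in scores.items():
    print(name, format_score(value))

# Missing scores serialize as null in JSON.
print(json.dumps(scores))
```

Returning `None` rather than `0.0` keeps "nothing to evaluate" distinguishable from "everything wrong" in both the printed table and the saved metrics.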
Adriane Boyd
5d2cb86c34
Fix on_match callback for DependencyMatcher ( #6313 )
Fix `DependencyMatcher` so that the callback is called only once per
match.
2020-10-31 12:20:27 +01:00
Sofie Van Landeghem
2918923541
fix resolving of dot notation ( #6326 )
2020-10-31 12:17:06 +01:00
Ines Montani
2c9804038d
Fix success message [ci skip]
2020-10-23 16:11:54 +02:00
Adriane Boyd
563a21834e
Save raw scores in evaluate output
2020-10-19 15:49:09 +02:00
Adriane Boyd
dd207ca6d0
Add dep_las_per_type and more generic PRF printer
2020-10-19 15:49:02 +02:00
Adriane Boyd
4300858ecb
Include per-type/feat scores in evaluate output
2020-10-19 15:48:55 +02:00
Sofie Van Landeghem
75a202ce65
TextCat updates and fixes ( #6263 )
* small fix in example imports
* throw error when train_corpus or dev_corpus is not a string
* small fix in custom logger example
* limit macro_auc to labels with 2 annotations
* fix typo
* also create parents of output_dir if need be
* update documentation of textcat scores
* refactor TextCatEnsemble
* fix tests for new AUC definition
* bump to 3.0.0a42
* update docs
* rename to spacy.TextCatEnsemble.v2
* spacy.TextCatEnsemble.v1 in legacy
* cleanup
* small fix
* update to 3.0.0rc2
* fix import that got lost in merge
* cursed IDE
* fix two typos
2020-10-18 14:50:41 +02:00
Ines Montani
5a6ed01ce0
Merge pull request #6262 from adrianeboyd/bugfix/template-en-vectors
2020-10-16 15:38:08 +02:00
Adriane Boyd
c8d04b79e2
Sort and add vectors for langs without transformers
2020-10-16 08:25:16 +02:00
Adriane Boyd
2fbd43c603
Use core lg models as vectors models in quickstart
2020-10-16 08:17:53 +02:00
Jan Margeta
1ad2213349
Fix TokenPatternSchema pattern field validation
An empty pattern field should be considered invalid.
This is fixed by replacing `minItems` with `min_items`,
as described in the Pydantic docs:
https://pydantic-docs.helpmanual.io/usage/schema/
2020-10-16 00:41:21 +02:00
Ines Montani
ff4267d181
Fix success message [ci skip]
2020-10-15 14:42:08 +02:00
Ines Montani
10611bf56a
Increment version [ci skip]
2020-10-15 13:30:11 +02:00
Ines Montani
4e17ddf75e
Merge pull request #6256 from adrianeboyd/bugfix/docs-to-json-raw
2020-10-15 10:35:01 +02:00
Ines Montani
b1d568a4df
Tidy up tests
2020-10-15 10:20:21 +02:00
Ines Montani
d165af26be
Auto-format [ci skip]
2020-10-15 10:08:53 +02:00
Adriane Boyd
a93d42861d
Use null raw for has_unknown_spaces in docs_to_json
2020-10-15 09:57:54 +02:00
Ines Montani
5665a21517
Tidy up
2020-10-15 09:30:32 +02:00
Ines Montani
5d62499266
Fix tests
2020-10-15 09:29:15 +02:00
Ines Montani
178760855f
Merge branch 'develop' into master-tmp
2020-10-15 09:06:03 +02:00
Ines Montani
bc85b12e6d
Merge pull request #6249 from svlandeg/feature/batch-tests
2020-10-15 08:57:56 +02:00
svlandeg
0796401c19
call NumpyOps instead of get_current_ops()
2020-10-14 16:55:00 +02:00
svlandeg
44e14ccae8
one more losses fix
2020-10-14 15:11:34 +02:00
svlandeg
0aa8851878
always return losses
2020-10-14 15:00:49 +02:00
svlandeg
e94a21638e
adding tests for trained models to ensure predict reproducibility
2020-10-13 21:07:13 +02:00
svlandeg
ede979d42f
formatting
2020-10-13 18:53:17 +02:00
svlandeg
ff83bfae3f
naming
2020-10-13 18:52:37 +02:00
svlandeg
6ccacff54e
add tests for individual spacy layers
2020-10-13 18:50:07 +02:00
svlandeg
c23041ae60
component tests single or multiple prediction
2020-10-13 16:26:53 +02:00
Ines Montani
1f49300862
Update transformer recommendations [ci skip]
2020-10-13 15:41:17 +02:00
Sofie Van Landeghem
f8a1c1afd6
avoid dropout at runtime ( #6247 )
2020-10-13 14:39:59 +02:00
Ines Montani
86d648740f
Fix morph representation in Doc.to_json
2020-10-13 11:39:03 +02:00
Ines Montani
7f92a5ee6a
Update spacy/lang/ta/examples.py
2020-10-13 11:03:35 +02:00
Ines Montani
a0e12c136b
Increment version [ci skip]
2020-10-13 10:00:53 +02:00
Ines Montani
f090f39f17
Merge pull request #6245 from svlandeg/bugfix/else
bugfix in _pipe
2020-10-13 09:59:06 +02:00
svlandeg
1f465bea18
if-else
2020-10-13 09:27:19 +02:00
svlandeg
40276fd3be
update NEL docs after latest refactor
2020-10-12 11:41:27 +02:00
Ines Montani
4fa967ea84
Increment version [ci skip]
2020-10-11 13:10:58 +02:00
Ines Montani
ab890a35f9
Make console logger table more compact
2020-10-11 12:55:46 +02:00
Ines Montani
99606e46fe
Relax meta.json schema [ci skip]
2020-10-11 12:30:57 +02:00
svlandeg
3a505e7e14
small edit to ensure the new word was indeed new
2020-10-10 21:05:28 +02:00
svlandeg
68d79796c6
add test for vocab after serializing KB
2020-10-10 20:59:48 +02:00
Ines Montani
539b0c10da
Tidy up and auto-format
2020-10-10 19:14:48 +02:00
Ines Montani
bfa3931c9d
Revert added_strings change ( #6236 )
2020-10-10 18:55:07 +02:00
Ines Montani
796f8b9424
Increment version
2020-10-09 18:00:27 +02:00
Ines Montani
525f798841
Fix typo in test
2020-10-09 18:00:21 +02:00
Ines Montani
8ac5f22253
Adjust error message
2020-10-09 18:00:16 +02:00
svlandeg
08cb085f6c
Merge remote-tracking branch 'upstream/develop' into fix/various
2020-10-09 17:01:27 +02:00
Ines Montani
b7cb9d95e4
Merge pull request #6229 from svlandeg/bugfix/disabled
2020-10-09 16:05:11 +02:00