spaCy/spacy/tests/pipeline
Adriane Boyd 03fefa37e2
Add overwrite settings for more components (#9050)
* Add overwrite settings for more components

For pipeline components where it's relevant and not already implemented,
add an explicit `overwrite` setting that controls whether
`set_annotations` overwrites existing annotation.

For the `morphologizer`, add an additional setting `extend`, which
controls whether the existing features are preserved.

  * +overwrite, +extend: overwrite values of existing features, add any new
    features
  * +overwrite, -extend: overwrite completely, removing any existing
    features
  * -overwrite, +extend: keep values of existing features, add any new
    features
  * -overwrite, -extend: do not modify the existing value if set

In all cases an unset value will be set by `set_annotations`.
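
The four `overwrite`/`extend` combinations can be read as a merge of two feature mappings. The following is only an illustrative sketch under stated assumptions, not the morphologizer's actual code: `merge_feats` and the plain-dict representation of morphological features are hypothetical names invented for this example.

```python
from typing import Dict

Feats = Dict[str, str]


def merge_feats(existing: Feats, predicted: Feats, *, overwrite: bool, extend: bool) -> Feats:
    """Hypothetical helper: combine existing and predicted features."""
    if not existing:
        # An unset value is always set, whatever the flags say.
        return dict(predicted)
    if overwrite and extend:
        # Predicted values win on conflicts; untouched existing features stay.
        return {**existing, **predicted}
    if overwrite:
        # Replace completely, dropping any existing features.
        return dict(predicted)
    if extend:
        # Existing values win on conflicts; new features are added.
        return {**predicted, **existing}
    # Neither flag: leave the existing value alone.
    return dict(existing)


existing = {"Case": "Nom", "Number": "Sing"}
predicted = {"Number": "Plur", "Gender": "Fem"}
for overwrite in (True, False):
    for extend in (True, False):
        merged = merge_feats(existing, predicted, overwrite=overwrite, extend=extend)
        print(f"overwrite={overwrite}, extend={extend}: {merged}")
```

With these inputs, only the `+overwrite` cases change `Number` to `Plur`, and only the `+extend` cases end up with all three features.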

Preserve current overwrite defaults:

  * True: morphologizer, entity linker
  * False: tagger, sentencizer, senter

* Add backwards compat overwrite settings

* Put empty line back

Removed by accident in last commit

* Set backwards-compatible defaults in __init__

Because the `TrainablePipe` serialization methods update `cfg`, there's
no straightforward way to detect whether models serialized with a
previous version are missing the overwrite settings.

Detection would be possible in the sentencizer because it has separate
serialization methods. However, to keep the changes parallel across
components, the sentencizer default is also set in `__init__`.
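
Why a default set in `__init__` survives loading an older model can be sketched with a toy class. This is an assumption-laden illustration, not spaCy's implementation: `ToyPipe`, `load_cfg`, and `LEGACY_OVERWRITE` are hypothetical names, and the only behaviour taken from the description above is that deserialization updates `cfg` rather than replacing it.

```python
from typing import Any, Dict

# Hypothetical constant: the component's behaviour before the setting existed.
LEGACY_OVERWRITE = False


class ToyPipe:
    """Toy stand-in for a pipeline component with a `cfg` dict."""

    def __init__(self, *, overwrite: bool = LEGACY_OVERWRITE) -> None:
        # The backwards-compatible default is written into cfg up front.
        self.cfg: Dict[str, Any] = {"overwrite": overwrite}

    def load_cfg(self, serialized: Dict[str, Any]) -> "ToyPipe":
        # update() merges keys: anything missing from the serialized config
        # keeps the value that __init__ put there.
        self.cfg.update(serialized)
        return self


# A config saved by an older version has no "overwrite" key at all.
old_model_cfg = {"labels": ["NOUN", "VERB"]}
new_model_cfg = {"labels": ["NOUN", "VERB"], "overwrite": True}

legacy = ToyPipe().load_cfg(old_model_cfg)
current = ToyPipe().load_cfg(new_model_cfg)
print(legacy.cfg["overwrite"])   # False: default from __init__ is kept
print(current.cfg["overwrite"])  # True: serialized value wins
```

After the update, a missing key is indistinguishable from one that was present with the default value, which is why the default has to be chosen before loading rather than detected afterwards.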

* Remove traces

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2021-09-30 15:35:55 +02:00
| File | Last commit | Date |
|---|---|---|
| `__init__.py` | Revert #4334 | 2019-09-29 17:32:12 +02:00 |
| `test_analysis.py` | Simplify pipe analysis | 2020-08-01 13:40:06 +02:00 |
| `test_annotates_on_update.py` | Tidy up and auto-format | 2021-07-18 15:44:56 +10:00 |
| `test_attributeruler.py` | Refactor scoring methods to use registered functions (#8766) | 2021-08-10 15:13:39 +02:00 |
| `test_entity_linker.py` | Tidy up with flake8: imports, comparisons, etc. | 2021-06-28 12:08:15 +02:00 |
| `test_entity_ruler.py` | Filter W036 for entity ruler, etc. (#8424) | 2021-06-21 09:34:29 +02:00 |
| `test_functions.py` | Add token_splitter component (#6726) | 2021-01-17 19:54:41 +08:00 |
| `test_initialize.py` | Test with default value | 2020-09-29 17:00:40 +02:00 |
| `test_lemmatizer.py` | Tidy up and auto-format | 2021-07-18 15:44:56 +10:00 |
| `test_models.py` | Tidy up code | 2021-06-28 12:08:15 +02:00 |
| `test_morphologizer.py` | Add overwrite settings for more components (#9050) | 2021-09-30 15:35:55 +02:00 |
| `test_pipe_factories.py` | Tidy up code | 2021-06-28 12:08:15 +02:00 |
| `test_pipe_methods.py` | Tidy up and auto-format | 2021-07-18 15:44:56 +10:00 |
| `test_sentencizer.py` | Refactor Docs.is_ flags (#6044) | 2020-09-17 00:14:01 +02:00 |
| `test_senter.py` | adding tests for trained models to ensure predict reproducibility | 2020-10-13 21:07:13 +02:00 |
| `test_spancat.py` | Auto-format code with black (#9065) | 2021-08-27 11:42:27 +02:00 |
| `test_tagger.py` | negative tag annotation (#8731) | 2021-07-19 14:39:11 +02:00 |
| `test_textcat.py` | Raise an error for textcat with <2 labels (#8584) | 2021-07-06 12:35:22 +02:00 |
| `test_tok2vec.py` | Tidy up code | 2021-06-28 12:08:15 +02:00 |