spaCy/spacy/tests/pipeline
Adriane Boyd 63f5951f8b Add AttributeRuler for token attribute exceptions
Add the `AttributeRuler` to handle exceptions for token-level
attributes. The `AttributeRuler` uses `Matcher` patterns to identify
target spans and applies the specified attributes to the token at the
provided index in the matched span. A negative index can be used to
index from the end of the matched span. The retokenizer is used to
"merge" the individual tokens and assign them the provided attributes.

Helper functions can convert existing tag maps and morph rules into the
corresponding `Matcher` patterns.
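
As a rough illustration of the tag map conversion mentioned above, the sketch below turns each tag map entry into a single-token pattern in the `Matcher` token-pattern format, paired with the attributes to assign. The function name `tag_map_to_rules` and the sample tag map are hypothetical and are not taken from this commit; the actual helpers may treat morphological features differently.

```python
# Minimal sketch in plain Python; it does not import or call spaCy.
# `tag_map_to_rules` is a hypothetical name used only for illustration.
def tag_map_to_rules(tag_map):
    """Convert {tag: attrs} into a list of (patterns, attrs) pairs."""
    rules = []
    for tag, attrs in tag_map.items():
        # One single-token pattern that matches on the fine-grained tag.
        patterns = [[{"TAG": tag}]]
        # Copy the attrs so the caller's tag map is not shared or mutated.
        rules.append((patterns, dict(attrs)))
    return rules


# Assumed sample data, not part of the commit.
sample_tag_map = {"NN": {"POS": "NOUN"}, "VBZ": {"POS": "VERB"}}
for patterns, attrs in tag_map_to_rules(sample_tag_map):
    print(patterns, "->", attrs)
```

Each resulting pair could then be registered as one rule, with the pattern selecting the token and the attrs dict supplying the values to set.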

There is an additional minor bug fix for `MORPH` attributes in the
retokenizer to correctly normalize the values and to handle `MORPH`
alongside `_` in an attrs dict.
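
The index handling described in the first paragraph of this message can be sketched in plain Python. This is only an illustration of the semantics (a non-negative index counts from the start of the matched span, a negative index counts from its end); `resolve_index` is a hypothetical helper, not a function from this commit, and the half-open `(start, end)` span is an assumption about how a match is represented.

```python
# Minimal sketch in plain Python; it does not import or call spaCy.
# `resolve_index` is a hypothetical helper used only for illustration.
def resolve_index(start, end, index):
    """Map an index relative to the span [start, end) to a document position."""
    length = end - start
    if not -length <= index < length:
        raise ValueError(f"index {index} is outside a span of length {length}")
    # Non-negative indices count from the span start, negative ones from its end.
    return start + index if index >= 0 else end + index


tokens = ["I", "visited", "the", "United", "States"]
# Assume a pattern matched tokens 2..4 ("the United States").
start, end = 2, 5
print(tokens[resolve_index(start, end, 0)])   # the
print(tokens[resolve_index(start, end, -1)])  # States
```

Rejecting out-of-range indices up front is a design choice of this sketch; it keeps an attribute from being silently applied to a token outside the match.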
2020-07-30 09:10:59 +02:00
__init__.py Revert #4334 2019-09-29 17:32:12 +02:00
test_analysis.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_attributeruler.py Add AttributeRuler for token attribute exceptions 2020-07-30 09:10:59 +02:00
test_entity_linker.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_entity_ruler.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_functions.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_morphologizer.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_pipe_factories.py Remove scores list from config and document 2020-07-28 11:22:24 +02:00
test_pipe_methods.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_sentencizer.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_senter.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_simple_ner.py Tidy up and auto-format 2020-06-20 14:15:04 +02:00
test_tagger.py Refactor pipeline components, config and language data (#5759) 2020-07-22 13:42:59 +02:00
test_textcat.py Merge branch 'develop' into feature/component-scores 2020-07-27 18:14:39 +02:00