mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-26 18:06:29 +03:00
53c0fb7431
* Only set NORM on Token in retokenizer

  Instead of setting `NORM` on both the token and lexeme, set `NORM` only on the token.

  The retokenizer tries to set all possible attributes with `Token/Lexeme.set_struct_attr` so that it doesn't have to enumerate which attributes are available for each. `NORM` is the only attribute that's stored on both, and for most cases it doesn't make sense to set the global norms based on an individual retokenization. For lexeme-only attributes like `IS_STOP` there's no way to avoid the global side effects, but I think that `NORM` would be better only on the token.

* Fix test
__init__.py
test_add_entities.py
test_array.py
test_creation.py
test_doc_api.py
test_morphanalysis.py
test_pickle_doc.py
test_retokenize_merge.py
test_retokenize_split.py
test_span.py
test_to_json.py
test_token_api.py
test_underscore.py
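
The commit message above distinguishes attributes stored per token from attributes stored on the shared lexeme. The following is a minimal, pure-Python sketch of that distinction and is not spaCy's actual implementation: `ToyLexeme`, `ToyVocab`, `ToyToken` and `set_attrs` are hypothetical names invented for illustration, and the fallback from a token-level norm to the lexeme norm is an assumption of this toy model. It shows why writing `NORM` only on the token keeps the change local, while a lexeme-only attribute such as `IS_STOP` necessarily affects every token that shares the lexeme.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyLexeme:
    """Context-independent entry shared by every token with the same text."""
    text: str
    norm: str
    is_stop: bool = False


@dataclass
class ToyVocab:
    """Hypothetical stand-in for a vocabulary that hands out shared lexemes."""
    lexemes: Dict[str, ToyLexeme] = field(default_factory=dict)

    def get(self, text: str) -> ToyLexeme:
        # Create the lexeme on first use; its default norm is the lowercased text.
        if text not in self.lexemes:
            self.lexemes[text] = ToyLexeme(text=text, norm=text.lower())
        return self.lexemes[text]


@dataclass
class ToyToken:
    """A token in context: points at a shared lexeme, may override its norm."""
    lex: ToyLexeme
    norm_override: Optional[str] = None

    @property
    def norm(self) -> str:
        # Assumption of this sketch: a token-level norm wins over the lexeme norm.
        if self.norm_override is not None:
            return self.norm_override
        return self.lex.norm


def set_attrs(token: ToyToken, attrs: Dict[str, object]) -> None:
    """Apply retokenization-style attributes to a single token."""
    for name, value in attrs.items():
        if name == "NORM":
            # Stored on the token only, so other tokens are unaffected.
            token.norm_override = str(value)
        elif name == "IS_STOP":
            # Lexeme-only attribute: the change is visible globally.
            token.lex.is_stop = bool(value)


vocab = ToyVocab()
first = ToyToken(vocab.get("NYC"))
second = ToyToken(vocab.get("NYC"))  # same text, same shared lexeme

set_attrs(first, {"NORM": "new york city", "IS_STOP": True})
```

After the call, `first.norm` holds the new value while `second.norm` still falls back to the lexeme's norm; `second.lex.is_stop`, by contrast, has changed because that flag only exists on the shared lexeme.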