spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-01-12 02:06:31 +03:00

Adriane Boyd 53c0fb7431	Only set NORM on Token in retokenizer (#6464)	2020-11-30 09:35:42 +08:00

* Only set NORM on Token in retokenizer

  Instead of setting `NORM` on both the token and lexeme, set `NORM` only on the token.

  The retokenizer tries to set all possible attributes with `Token/Lexeme.set_struct_attr` so that it doesn't have to enumerate which attributes are available for each. `NORM` is the only attribute that's stored on both, and in most cases it doesn't make sense to set the global norms based on an individual retokenization. For lexeme-only attributes like `IS_STOP` there's no way to avoid the global side effects, but I think that `NORM` would be better only on the token.

* Fix test
__init__.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_add_entities.py	Fix test imports	2019-09-29 17:34:56 +02:00
test_array.py	Tidy up and auto-format	2020-03-25 12:28:12 +01:00
test_creation.py	Tidy up and auto-format	2020-05-21 14:14:01 +02:00
test_doc_api.py	Add ent_id_ to strings serialized with Doc (#6353)	2020-11-10 20:16:07 +08:00
test_morphanalysis.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_pickle_doc.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_retokenize_merge.py	Only set NORM on Token in retokenizer (#6464)	2020-11-30 09:35:42 +08:00
test_retokenize_split.py	Fix norm in retokenizer split (#6111)	2020-09-22 21:53:33 +02:00
test_span.py	Fix/span.sent (#6083)	2020-10-01 14:01:52 +02:00
test_to_json.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_token_api.py	Tidy up and auto-format	2020-05-21 14:14:01 +02:00
test_underscore.py	use clean_underscore fixture	2020-02-23 15:49:20 +01:00
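
The commit message for #6464 distinguishes attributes stored per token from attributes stored on the shared lexeme. The sketch below is a toy model of that distinction in plain Python, not spaCy's implementation: the `Lexeme`, `Token`, `Vocab`, and `merge` names are hypothetical stand-ins, and the rule that a token norm falls back to the lexeme norm when unset is an assumption made for illustration. It only shows why writing `NORM` to the token keeps the change local, while a lexeme-only attribute such as `IS_STOP` necessarily affects every token that shares the lexeme.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Lexeme:
    """Hypothetical vocabulary entry shared by all tokens with the same text."""
    text: str
    norm: str
    is_stop: bool = False


@dataclass
class Token:
    """Hypothetical token: one occurrence in a document, backed by a Lexeme."""
    lex: Lexeme
    norm_override: Optional[str] = None

    @property
    def norm(self) -> str:
        # Assumption: a token-level norm wins, otherwise use the lexeme's norm.
        if self.norm_override is not None:
            return self.norm_override
        return self.lex.norm


class Vocab:
    """Hypothetical store that hands out one Lexeme per distinct text."""

    def __init__(self) -> None:
        self._lexemes: Dict[str, Lexeme] = {}

    def get(self, text: str) -> Lexeme:
        if text not in self._lexemes:
            self._lexemes[text] = Lexeme(text=text, norm=text.lower())
        return self._lexemes[text]


def merge(vocab: Vocab, tokens: List[Token], attrs: Dict[str, object]) -> Token:
    """Merge tokens into one, applying attrs as the commit message describes."""
    merged = Token(lex=vocab.get(" ".join(t.lex.text for t in tokens)))
    for name, value in attrs.items():
        if name == "NORM":
            # Stored on the token only, so the shared lexeme is untouched.
            merged.norm_override = str(value)
        elif name == "IS_STOP":
            # Lexeme-only attribute: the side effect is unavoidably global.
            merged.lex.is_stop = bool(value)
    return merged


vocab = Vocab()
parts = [Token(vocab.get("New")), Token(vocab.get("York"))]
merged = merge(vocab, parts, {"NORM": "nyc", "IS_STOP": True})

# A later, unrelated token with the same text shares the same lexeme.
other = Token(vocab.get("New York"))
```

In this model, `merged.norm` reflects the custom value while `other.norm` still comes from the lexeme, whereas the `IS_STOP` change is visible through `other` as well.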