spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-12-06 01:34:25 +03:00

Adriane Boyd f4339f9bff	2021-04-22 18:14:57 +10:00
Fix tokenizer cache flushing (#7836)

* Fix tokenizer cache flushing

  Fix/simplify tokenizer init detection in order to fix cache flushing when properties are modified.

* Remove init reloading logic
* Remove logic disabling `_reload_special_cases` on init
* Setting `rules` last in `__init__` (as before) means that setting other properties doesn't reload any special cases
* Reset `rules` first in `from_bytes` so that setting other properties during deserialization doesn't reload any special cases unnecessarily
* Reset all properties in `Tokenizer.from_bytes` to allow any settings to be `None`
* Also reset special matcher when special cache is flushed
* Remove duplicate special case validation
* Add test for special cases flushing
* Extend test for tokenizer deserialization of None values

..
__init__.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_resource_warning.py	Tidy up tests	2020-10-15 10:20:21 +02:00
test_serialize_config.py	Ensure hyphen in config file works as string value (#7642)	2021-04-12 14:35:57 +02:00
test_serialize_doc.py	Add SpanGroup and Graph container types to represent arbitrary annotations (#6696)	2021-01-14 17:30:41 +11:00
test_serialize_extension_attrs.py	Merge branch 'master' into develop	2020-02-18 14:47:23 +01:00
test_serialize_kb.py	consistently use registry as callable	2021-03-02 17:56:28 +01:00
test_serialize_language.py	Remove dead and/or deprecated code (#5710)	2020-07-06 13:06:25 +02:00
test_serialize_pipeline.py	multi-label textcat component (#6474)	2021-01-06 13:07:14 +11:00
test_serialize_tokenizer.py	Fix tokenizer cache flushing (#7836)	2021-04-22 18:14:57 +10:00
test_serialize_vocab_strings.py	Make vocab update in get_docs deterministic (#7603)	2021-04-09 11:53:13 +02:00