Mirror of https://github.com/explosion/spaCy.git (synced 2025-10-24 04:31:17 +03:00)

* Fix tokenizer cache flushing

  Fix/simplify tokenizer init detection in order to fix cache flushing
  when properties are modified.
* Remove init reloading logic
* Remove logic disabling `_reload_special_cases` on init
  * Setting `rules` last in `__init__` (as before) means that setting
    other properties doesn't reload any special cases
  * Reset `rules` first in `from_bytes` so that setting other properties
    during deserialization doesn't reload any special cases
    unnecessarily
* Reset all properties in `Tokenizer.from_bytes` to allow any settings
  to be `None`
* Also reset special matcher when special cache is flushed
* Remove duplicate special case validation
* Add test for special cases flushing
* Extend test for tokenizer deserialization of None values
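
The flushing behavior described above can be illustrated with a small, standard-library-only sketch. This is not spaCy's implementation: `ToyTokenizer`, `special_loads`, and the helper methods are hypothetical names invented for illustration, and only `rules`, `suffix_search`, and `_reload_special_cases` echo names that exist in spaCy. The sketch assumes that every property setter flushes the token cache and reloads special cases, and that `__init__` assigns `rules` last so the earlier setters have nothing to reload.

```python
import re
from typing import Callable, Dict, List, Optional


class ToyTokenizer:
    """Hypothetical stand-in for illustration; not spaCy's Tokenizer."""

    def __init__(
        self,
        rules: Optional[Dict[str, List[str]]] = None,
        suffix_search: Optional[Callable[[str], Optional[re.Match]]] = None,
    ) -> None:
        self._cache: Dict[str, List[str]] = {}
        self._rules: Dict[str, List[str]] = {}
        self._special: Dict[str, List[str]] = {}
        self._suffix_search = None
        self.special_loads = 0  # counts how many special cases were (re)loaded
        # Other properties first: rules are still empty, so nothing is reloaded.
        self.suffix_search = suffix_search
        # Rules last: each special case is loaded exactly once during init.
        self.rules = rules

    @property
    def suffix_search(self):
        return self._suffix_search

    @suffix_search.setter
    def suffix_search(self, value) -> None:
        self._suffix_search = value
        self._reload_special_cases()

    @property
    def rules(self) -> Dict[str, List[str]]:
        return self._rules

    @rules.setter
    def rules(self, value: Optional[Dict[str, List[str]]]) -> None:
        self._rules = dict(value or {})  # None is accepted and means "no rules"
        self._reload_special_cases()

    def _reload_special_cases(self) -> None:
        # Flush cached tokenizations and rebuild the special-case table together,
        # so neither can go stale after a property changes.
        self._cache.clear()
        self._special = {}
        for key, pieces in self._rules.items():
            self._special[key] = list(pieces)
            self.special_loads += 1

    def _split(self, word: str) -> List[str]:
        if word in self._special:
            return list(self._special[word])
        if self._suffix_search is not None:
            match = self._suffix_search(word)
            if match and match.start() > 0:
                return [word[: match.start()], word[match.start():]]
        return [word]

    def tokenize(self, text: str) -> List[str]:
        tokens: List[str] = []
        for word in text.split():
            if word not in self._cache:
                self._cache[word] = self._split(word)
            tokens.extend(self._cache[word])
        return tokens


tok = ToyTokenizer(rules={"don't": ["do", "n't"]})
print(tok.tokenize("don't stop."))  # ['do', "n't", 'stop.']
tok.suffix_search = re.compile(r"\.$").search  # setter flushes the cache
print(tok.tokenize("don't stop."))  # ['do', "n't", 'stop', '.']
```

Without the flush in the setter, the second call would return the stale cached result for `stop.`, which is the kind of bug the commit message describes fixing.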

| File |
|---|
| `__init__.py` |
| `test_resource_warning.py` |
| `test_serialize_config.py` |
| `test_serialize_doc.py` |
| `test_serialize_extension_attrs.py` |
| `test_serialize_kb.py` |
| `test_serialize_language.py` |
| `test_serialize_pipeline.py` |
| `test_serialize_tokenizer.py` |
| `test_serialize_vocab_strings.py` |