mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-27 10:26:35 +03:00
f4339f9bff
* Fix tokenizer cache flushing Fix/simplify tokenizer init detection in order to fix cache flushing when properties are modified. * Remove init reloading logic * Remove logic disabling `_reload_special_cases` on init * Setting `rules` last in `__init__` (as before) means that setting other properties doesn't reload any special cases * Reset `rules` first in `from_bytes` so that setting other properties during deserialization doesn't reload any special cases unnecessarily * Reset all properties in `Tokenizer.from_bytes` to allow any settings to be `None` * Also reset special matcher when special cache is flushed * Remove duplicate special case validation * Add test for special cases flushing * Extend test for tokenizer deserialization of None values |
||
---|---|---|
.. | ||
__init__.py | ||
sun.txt | ||
test_exceptions.py | ||
test_explain.py | ||
test_naughty_strings.py | ||
test_tokenizer.py | ||
test_urls.py | ||
test_whitespace.py |