spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-02-04 22:39:50 +03:00

Adriane Boyd 3711af74e5 Add tokenizer option to allow Matcher handling for all rules (#10452)	2022-03-24 13:21:32 +01:00

* Add tokenizer option to allow Matcher handling for all rules

  Add tokenizer option `with_faster_rules_heuristics` that determines whether the special cases applied by the internal `Matcher` are filtered by whether they contain affixes or space. If `True` (default), the rules are filtered to prioritize speed over rare edge cases. If `False`, all rules are included in the final `Matcher`-based pass over the doc.
* Reset all caches when reloading special cases
* Revert "Reset all caches when reloading special cases"

  This reverts commit `4ef6bd171d`.
* Initialize max_length properly
* Add new tag to API docs
* Rename to faster heuristics

__init__.py	Revert #4334	2019-09-29 17:32:12 +02:00
sun.txt	Revert #4334	2019-09-29 17:32:12 +02:00
test_exceptions.py	Ignore prefix in suffix matches (#9155)	2021-10-27 13:02:25 +02:00
test_explain.py	Update Tokenizer.explain with special matches (#7749)	2021-04-19 19:08:20 +10:00
test_naughty_strings.py	Merge branch 'develop' into master-tmp	2020-09-04 13:15:36 +02:00
test_tokenizer.py	Add tokenizer option to allow Matcher handling for all rules (#10452)	2022-03-24 13:21:32 +01:00
test_urls.py	Merge branch 'develop' into master-tmp	2020-06-20 15:52:00 +02:00
test_whitespace.py	Merge branch 'develop' into master-tmp	2020-09-04 13:15:36 +02:00