Mirror of https://github.com/explosion/spaCy.git, synced 2025-02-03 13:14:11 +03:00
1139247532
* Revert changes to priority of `token_match` so that it has priority over all other tokenizer patterns
* Add lookahead and potentially slow lookbehind back to the default URL pattern
* Expand character classes in URL pattern to improve matching around lookaheads and lookbehinds related to #4882
* Revert changes to Hungarian tokenizer
* Revert (xfail) several URL tests to their status before #4374
* Update `tokenizer.explain()` and docs accordingly

| File |
|---|
| __init__.py |
| sun.txt |
| test_exceptions.py |
| test_explain.py |
| test_naughty_strings.py |
| test_tokenizer.py |
| test_urls.py |
| test_whitespace.py |
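
The commit message above says that `token_match` takes priority over all other tokenizer patterns, and that lookaheads and lookbehinds were added back to the URL pattern. The following is a minimal, standard-library-only sketch of why those two changes plausibly go together. It is a toy model, not spaCy's implementation: `GREEDY_URL`, `GUARDED_URL`, `PREFIX`, `SUFFIX`, `tokenize_chunk` and `tokenize` are all hypothetical names and patterns chosen for illustration, and the real default URL pattern is considerably more elaborate.

```python
import re

# Hypothetical patterns for illustration only (not spaCy's defaults).
# A greedy URL matcher that also swallows trailing punctuation.
GREEDY_URL = re.compile(r"https?://\S+").fullmatch
# The same matcher with a lookbehind that refuses to end on punctuation.
GUARDED_URL = re.compile(r"https?://\S+(?<![)\].,!?])").fullmatch

PREFIX = re.compile(r"^[(\[]")
SUFFIX = re.compile(r"[)\].,!?]$")


def tokenize_chunk(chunk, token_match):
    """Split one whitespace-delimited chunk into tokens."""
    prefixes, suffixes = [], []
    while chunk:
        # token_match is consulted first: a full match is kept whole,
        # and no prefix or suffix rule gets a chance to split it.
        if token_match(chunk):
            break
        match = PREFIX.search(chunk)
        if match:
            prefixes.append(match.group())
            chunk = chunk[match.end():]
            continue
        match = SUFFIX.search(chunk)
        if match:
            suffixes.append(match.group())
            chunk = chunk[:match.start()]
            continue
        break
    middle = [chunk] if chunk else []
    # Suffixes were collected from the outside in, so reverse them.
    return prefixes + middle + suffixes[::-1]


def tokenize(text, token_match):
    """Tokenize text by splitting on whitespace, then on affixes."""
    return [
        token
        for chunk in text.split()
        for token in tokenize_chunk(chunk, token_match)
    ]


text = "(see https://example.com/a)."
print(tokenize(text, GREEDY_URL))
print(tokenize(text, GUARDED_URL))
```

In this sketch, the greedy pattern matches the whole chunk `https://example.com/a).` before any suffix is stripped, so the closing parenthesis and period stay attached to the URL. With the lookbehind, the match is rejected until the trailing punctuation has been split off, which yields `https://example.com/a`, `)` and `.` as separate tokens. Once the match function has priority, the pattern itself has to be careful about where it is allowed to end.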