Mirror of https://github.com/explosion/spaCy.git (synced 2025-11-03 09:27:56 +03:00)
* Revert changes to priority of `token_match` so that it has priority over all other tokenizer patterns
* Add lookahead and potentially slow lookbehind back to the default URL pattern
* Expand character classes in URL pattern to improve matching around lookaheads and lookbehinds related to #4882
* Revert changes to Hungarian tokenizer
* Revert (xfail) several URL tests to their status before #4374
* Update `tokenizer.explain()` and docs accordingly

| File |
|---|
| __init__.py |
| sun.txt |
| test_exceptions.py |
| test_explain.py |
| test_naughty_strings.py |
| test_tokenizer.py |
| test_urls.py |
| test_whitespace.py |
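
The commit notes above say that `token_match` takes priority over all other tokenizer patterns and that the URL pattern regains a lookbehind. The following is a minimal toy sketch of why those two points go together. It uses only the standard `re` module and is not spaCy's implementation: `GREEDY_URL`, `BOUNDED_URL`, `PREFIXES`, `SUFFIXES`, `split_chunk` and `tokenize` are hypothetical names and simplified patterns chosen for illustration.

```python
import re

# Hypothetical, simplified patterns (NOT spaCy's defaults).
GREEDY_URL = re.compile(r"https?://\S+")
# Same pattern, but a lookbehind forbids ending on closing punctuation.
BOUNDED_URL = re.compile(r"https?://\S+(?<![.,)\]])")

PREFIXES = re.compile(r"""^[(\["']""")
SUFFIXES = re.compile(r"""[)\]"'.,!?]$""")


def split_chunk(chunk, token_match):
    """Split one whitespace-delimited chunk into tokens.

    token_match is checked before any prefix or suffix is stripped,
    so a chunk it accepts is kept whole.
    """
    prefixes, suffixes = [], []
    while chunk:
        if token_match.fullmatch(chunk):
            break  # token_match wins over prefix/suffix rules
        m = PREFIXES.search(chunk)
        if m:
            prefixes.append(m.group())
            chunk = chunk[m.end():]
            continue
        m = SUFFIXES.search(chunk)
        if m:
            suffixes.append(m.group())
            chunk = chunk[:m.start()]
            continue
        break  # nothing left to strip
    middle = [chunk] if chunk else []
    # Suffixes were collected from the outside in, so reverse them.
    return prefixes + middle + suffixes[::-1]


def tokenize(text, token_match):
    tokens = []
    for chunk in text.split():
        tokens.extend(split_chunk(chunk, token_match))
    return tokens


text = "(see https://example.com/a.)"
print(tokenize(text, GREEDY_URL))
print(tokenize(text, BOUNDED_URL))
```

In this toy model, the greedy pattern accepts the whole chunk `https://example.com/a.)` before any suffix rule runs, so the trailing `.` and `)` stay attached to the URL. With the lookbehind, the full-chunk match fails until the suffix rules have stripped the punctuation, and the URL comes out as its own token.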