mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-25 09:26:27 +03:00
de69bc6509
* match domains longer than `hostname.domain.tld` like `www.foo.co.uk`
* expand allowed characters in domain names while only matching lowercase TLDs so that "this.That" isn't matched as a URL and can be split on the period as an infix (relevant for at least English, German, and Tatar)

| File |
|---|
| `__init__.py` |
| `sun.txt` |
| `test_exceptions.py` |
| `test_explain.py` |
| `test_naughty_strings.py` |
| `test_tokenizer.py` |
| `test_urls.py` |
| `test_whitespace.py` |
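
The URL-matching behavior described in the commit message can be illustrated with a small sketch. This is not spaCy's actual URL pattern, which is considerably more involved; the names `SIMPLE_URL` and `looks_like_url` are hypothetical, and the regex is a simplified assumption that only demonstrates the two stated ideas: any number of dot-separated domain labels, and a TLD restricted to lowercase letters.

```python
import re

# Hypothetical, simplified pattern for illustration only.
# It is NOT the pattern used by spaCy's tokenizer.
SIMPLE_URL = re.compile(
    r"^(?:https?://)?"                   # optional scheme
    r"(?:[A-Za-z0-9][A-Za-z0-9_-]*\.)+"  # one or more dot-terminated labels, any case
    r"[a-z]{2,}"                         # TLD: lowercase letters only
    r"(?:[/?#]\S*)?$"                    # optional path, query, or fragment
)


def looks_like_url(text: str) -> bool:
    """Return True if `text` matches the simplified URL sketch above."""
    return SIMPLE_URL.match(text) is not None


print(looks_like_url("hostname.domain.tld"))  # three labels
print(looks_like_url("www.foo.co.uk"))        # more than three labels
print(looks_like_url("this.That"))            # capitalized final label
```

Under this sketch, `www.foo.co.uk` is accepted because the label group may repeat, while `this.That` is rejected because its final label starts with an uppercase letter. A string that is rejected as a URL remains available for infix splitting on the period, which is the effect the commit message describes.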