spaCy/spacy/tests/lang/tr
Duygu Altinok 0e55f806dd
Turkish tokenization improvements (#6268)
* added single and paired orth variants

* added token match

* added long text tokenization test

* inverted init

* normalized lemmas to lowercase

* more abbrevs

* tests for ordinals and abbrevs

* separated period abbrevs into another list

* fixed typo

* added ordinal and abbrev tests

* added number tests for dates

* minor refinement

* added inflected abbrevs regex

* added percentage and inflection

* cosmetics

* added token match

* added url inflection tests

* excluded url tokens from custom pattern

* removed url match import
2020-10-29 09:43:17 +01:00
__init__.py Revert #4334 2019-09-29 17:32:12 +02:00
test_noun_chunks.py Turkish language syntax iterators (#6191) 2020-10-07 11:07:52 +02:00
test_parser.py Turkish language syntax iterators (#6191) 2020-10-07 11:07:52 +02:00
test_text.py Turkish tokenization improvements (#6268) 2020-10-29 09:43:17 +01:00
test_tokenizer.py Turkish tokenization improvements (#6268) 2020-10-29 09:43:17 +01:00