spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-03-27 13:24:13 +03:00


Duygu Altinok 0e55f806dd Turkish tokenization improvements (#6268)	2020-10-29 09:43:17 +01:00

* added single and paired orth variants
* added token match
* added long text tokenization test
* inverted init
* normalized lemmas to lowercase
* more abbrevs
* tests for ordinals and abbrevs
* separated period abbvrevs to another list
* fiex typo
* added ordinal and abbrev tests
* added number tests for dates
* minor refinement
* added inflected abbrevs regex
* added percentage and inflection
* cosmetics
* added token match
* added url inflection tests
* excluded url tokens from custom pattern
* removed url match import
__init__.py	Turkish tokenization improvements (#6268)	2020-10-29 09:43:17 +01:00
examples.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
lex_attrs.py	Ordinal numbers for Turkish (#6142)	2020-10-07 10:25:37 +02:00
morph_rules.py	Turkish tag map and morph rules addition (#6141)	2020-10-07 10:27:36 +02:00
stop_words.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
syntax_iterators.py	Turkish language syntax iterators (#6191)	2020-10-07 11:07:52 +02:00
tokenizer_exceptions.py	Turkish tokenization improvements (#6268)	2020-10-29 09:43:17 +01:00