spaCy/spacy/lang/uk
Adriane Boyd 30d31fd335
Update Russian and Ukrainian lemmatizers (#11811)
* pymorphy2 issues #11620, #11626, #11625:
- #11620: pymorphy2_lookup
- #11626: handle multiple forms pointing to the same normal form + handling empty POS tag
- #11625: matching DET that are labelled as PRON by pymorphy2

* Move lemmatizer algorithm changes back into RussianLemmatizer

* Fix uk pymorphy3_lookup mode init

* Move and update tests for ru/uk lookup lemmatizer modes

* Fix typo

* Remove traces of previous behavior for uninflected POS

* Refactor to private generic-looking pymorphy methods

* Remove xfailed uk lemmatizer cases

* Update spacy/lang/ru/lemmatizer.py

Co-authored-by: Richard Hudson <richard@explosion.ai>

Co-authored-by: Dmytro S Lituiev <d.lituiev@gmail.com>
2022-11-25 11:12:46 +01:00
__init__.py Switch ru and uk lemmatizers to pymorphy3 (#11345) 2022-08-22 11:27:14 +02:00
examples.py Tidy up and auto-format 2020-02-18 15:38:18 +01:00
lemmatizer.py Update Russian and Ukrainian lemmatizers (#11811) 2022-11-25 11:12:46 +01:00
lex_attrs.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
stop_words.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
tokenizer_exceptions.py Update Ukrainian tokenizer_exceptions 2022-02-01 13:24:00 +02:00
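
The latest lemmatizer.py commit mentions handling "multiple forms pointing to the same normal form" in the lookup modes. A minimal sketch of that idea follows. It is not the code in lemmatizer.py: the `Analysis` tuple and the `lookup_lemma` function are hypothetical names introduced here for illustration, and the fallback to the surface form is an assumption about the intended behavior.

```python
from typing import List, NamedTuple


class Analysis(NamedTuple):
    """Hypothetical stand-in for one morphological analysis of a word form."""

    normal_form: str
    pos: str  # may be empty when the analyzer assigns no POS tag


def lookup_lemma(surface: str, analyses: List[Analysis]) -> List[str]:
    """Return the shared normal form, or the surface form if analyses disagree."""
    # Several analyses often point to the same normal form, so compare
    # unique normal forms instead of counting raw analyses.
    normal_forms = {a.normal_form for a in analyses}
    if len(normal_forms) == 1:
        return [next(iter(normal_forms))]
    # No analysis at all, or genuinely ambiguous: keep the token text unchanged.
    return [surface]


# Two analyses (one with an empty POS tag) that agree on the normal form.
agreeing = [Analysis("стіл", "NOUN"), Analysis("стіл", "")]
print(lookup_lemma("столи", agreeing))  # ['стіл']

# Two analyses with different normal forms: the surface form is kept.
ambiguous = [Analysis("мати", "NOUN"), Analysis("мати", "VERB"), Analysis("мата", "NOUN")]
print(lookup_lemma("мати", ambiguous))  # ['мати']
```

Comparing unique normal forms rather than the number of analyses means a word with many inflectional readings still resolves to a single lemma as long as every reading shares it.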