spaCy/spacy/tests/lang/uk/test_lemmatizer.py

import pytest
from spacy.tokens import Doc


pytestmark = pytest.mark.filterwarnings("ignore::DeprecationWarning")


def test_uk_lemmatizer(uk_lemmatizer):
    """Check that the default uk lemmatizer runs."""
    doc = Doc(uk_lemmatizer.vocab, words=["a", "b", "c"])
    assert uk_lemmatizer.mode == "pymorphy3"
    uk_lemmatizer(doc)
    assert [token.lemma for token in doc]


@pytest.mark.parametrize(
    "word,lemma",
    (
        ("якийсь", "якийсь"),
        ("розповідають", "розповідати"),
        ("розповіси", "розповісти"),
    ),
)
def test_uk_lookup_lemmatizer(uk_lookup_lemmatizer, word, lemma):
    assert uk_lookup_lemmatizer.mode == "pymorphy3_lookup"
    doc = Doc(uk_lookup_lemmatizer.vocab, words=[word])
    assert uk_lookup_lemmatizer(doc)[0].lemma_ == lemma
Clean up warnings in the test suite (#11331) 2022-08-22 13:04:30 +03:00			`import pytest`
Restrict pymorphy2 requirement to pymorphy2 mode (#8299) For the Russian and Ukrainian lemmatizers, restrict the `pymorphy2` requirement to the mode `pymorphy2` so that lookup or other lemmatizer modes can be loaded without installing `pymorphy2`. 2021-06-11 11:19:22 +03:00			`from spacy.tokens import Doc`


Clean up warnings in the test suite (#11331) 2022-08-22 13:04:30 +03:00			`pytestmark = pytest.mark.filterwarnings("ignore::DeprecationWarning")`


Restrict pymorphy2 requirement to pymorphy2 mode (#8299) For the Russian and Ukrainian lemmatizers, restrict the `pymorphy2` requirement to the mode `pymorphy2` so that lookup or other lemmatizer modes can be loaded without installing `pymorphy2`. 2021-06-11 11:19:22 +03:00			`def test_uk_lemmatizer(uk_lemmatizer):`
			`"""Check that the default uk lemmatizer runs."""`
			`doc = Doc(uk_lemmatizer.vocab, words=["a", "b", "c"])`
Update Russian and Ukrainian lemmatizers (#11811) * pymorph2 issues #11620, #11626, #11625: - #11620: pymorphy2_lookup - #11626: handle multiple forms pointing to the same normal form + handling empty POS tag - #11625: matching DET that are labelled as PRON by pymorhp2 * Move lemmatizer algorithm changes back into RussianLemmatizer * Fix uk pymorphy3_lookup mode init * Move and update tests for ru/uk lookup lemmatizer modes * Fix typo * Remove traces of previous behavior for uninflected POS * Refactor to private generic-looking pymorphy methods * Remove xfailed uk lemmatizer cases * Update spacy/lang/ru/lemmatizer.py Co-authored-by: Richard Hudson <richard@explosion.ai> Co-authored-by: Dmytro S Lituiev <d.lituiev@gmail.com> Co-authored-by: Richard Hudson <richard@explosion.ai> 2022-11-25 13:12:46 +03:00			`assert uk_lemmatizer.mode == "pymorphy3"`
Restrict pymorphy2 requirement to pymorphy2 mode (#8299) For the Russian and Ukrainian lemmatizers, restrict the `pymorphy2` requirement to the mode `pymorphy2` so that lookup or other lemmatizer modes can be loaded without installing `pymorphy2`. 2021-06-11 11:19:22 +03:00			`uk_lemmatizer(doc)`
Fix init for pymorphy2_lookup lemmatizer mode (#11631) 2022-10-12 13:18:39 +03:00			`assert [token.lemma for token in doc]`


Update Russian and Ukrainian lemmatizers (#11811) * pymorph2 issues #11620, #11626, #11625: - #11620: pymorphy2_lookup - #11626: handle multiple forms pointing to the same normal form + handling empty POS tag - #11625: matching DET that are labelled as PRON by pymorhp2 * Move lemmatizer algorithm changes back into RussianLemmatizer * Fix uk pymorphy3_lookup mode init * Move and update tests for ru/uk lookup lemmatizer modes * Fix typo * Remove traces of previous behavior for uninflected POS * Refactor to private generic-looking pymorphy methods * Remove xfailed uk lemmatizer cases * Update spacy/lang/ru/lemmatizer.py Co-authored-by: Richard Hudson <richard@explosion.ai> Co-authored-by: Dmytro S Lituiev <d.lituiev@gmail.com> Co-authored-by: Richard Hudson <richard@explosion.ai> 2022-11-25 13:12:46 +03:00			`@pytest.mark.parametrize(`
			`"word,lemma",`
			`(`
			`("якийсь", "якийсь"),`
			`("розповідають", "розповідати"),`
			`("розповіси", "розповісти"),`
			`),`
			`)`
			`def test_uk_lookup_lemmatizer(uk_lookup_lemmatizer, word, lemma):`
			`assert uk_lookup_lemmatizer.mode == "pymorphy3_lookup"`
			`doc = Doc(uk_lookup_lemmatizer.vocab, words=[word])`
			`assert uk_lookup_lemmatizer(doc)[0].lemma_ == lemma`