spaCy/tests/regression/test_issue4104.py

# coding: utf8
from __future__ import unicode_literals

from ..util import get_doc


def test_issue4104(en_vocab):
    """Test that English lookup lemmatization of 'spun' and 'dry' is correct.
    Expected mapping: {'dry': 'dry', 'spun': 'spin', 'spun-dry': 'spin-dry'}
    """
    text = "dry spun spun-dry"
    doc = get_doc(en_vocab, text.split(" "))
    # using a simple list to preserve order
    expected = ["dry", "spin", "spin-dry"]
    assert [token.lemma_ for token in doc] == expected
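
The mapping in the docstring can be pictured as a plain dictionary lookup. The following is only a minimal sketch under stated assumptions: `LOOKUP` and `lookup_lemma` are hypothetical names, not spaCy API, and it assumes that a word with no table entry falls back to its own form.

```python
# Hypothetical stand-in for a lookup table; not spaCy's actual data structure.
LOOKUP = {"spun": "spin", "spun-dry": "spin-dry"}


def lookup_lemma(word, table=LOOKUP):
    """Return the table entry for `word`, or `word` itself if it has none (assumed fallback)."""
    return table.get(word, word)


# Same input as the regression test, split on single spaces.
lemmas = [lookup_lemma(w) for w in "dry spun spun-dry".split(" ")]
```

Under this sketch the hyphenated form needs its own entry, since the table is keyed on whole tokens rather than on their parts.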
Correction of default lemmatizer lookup in English (Issue # 4104) (#4110) 2019-08-15 12:39:10 +03:00
* pytest file for issue4104 established
* edited default lookup english lemmatizer for spun; fixes issue 4102
* eliminated parameterization and sorted dictionary dependency in issue 4104 test
* added contributor agreement
Tidy up and auto-format 2019-08-18 16:09:16 +03:00