spaCy/spacy/lang/ro
Jani Monoses 0e08e49e87 Lemmatizer ro (#2319)
* Add Romanian lemmatizer lookup table.

Adapted from http://www.lexiconista.com/datasets/lemmatization/
by replacing the cedilla letters (ş and ţ) with their comma-below forms (ș and ț).

The original dataset is licensed under the Open Database License.

* Fix one blatant issue in the Romanian lemmatizer

* Romanian examples file

* Add ro_tokenizer in conftest

* Add Romanian lemmatizer test
2018-05-12 15:20:04 +02:00
__init__.py Lemmatizer ro (#2319) 2018-05-12 15:20:04 +02:00
examples.py Lemmatizer ro (#2319) 2018-05-12 15:20:04 +02:00
lemmatizer.py Lemmatizer ro (#2319) 2018-05-12 15:20:04 +02:00
stop_words.py Update Romanian stopword list (#2316) 2018-05-10 12:16:56 +02:00
tokenizer_exceptions.py Add Romanian and Croatian skeletons (experimental) 2017-11-01 23:04:28 +01:00
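
The commit message says the lookup table was adapted by swapping cedilla letters for comma-below letters. The sketch below shows one way such a substitution could be done in Python. It is not taken from the commit itself: the names `CEDILLA_TO_COMMA` and `normalize_romanian` are hypothetical, and including the uppercase letters is an assumption.

```python
# Hypothetical helper, not part of the spaCy commit: map the legacy
# cedilla letters to the comma-below letters used in standard Romanian.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015f": "\u0219",  # s with cedilla -> s with comma below
    "\u015e": "\u0218",  # S with cedilla -> S with comma below
    "\u0163": "\u021b",  # t with cedilla -> t with comma below
    "\u0162": "\u021a",  # T with cedilla -> T with comma below
})


def normalize_romanian(text: str) -> str:
    """Return text with cedilla s/t replaced by their comma-below forms."""
    return text.translate(CEDILLA_TO_COMMA)
```

Applying the same function to both the keys and the values of a lookup table would keep word forms and lemmas consistent with text that already uses the comma-below letters.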