Mirror of https://github.com/explosion/spaCy.git, synced 2024-12-27 10:26:35 +03:00
0e08e49e87
* Add Romanian lemmatizer lookup table. Adapted from http://www.lexiconista.com/datasets/lemmatization/ by replacing cedillas with commas (ș and ț). The original dataset is licensed under the Open Database License.
* Fix one blatant issue in the Romanian lemmatizer
* Romanian examples file
* Add ro_tokenizer in conftest
* Add Romanian lemmatizer test
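The cedilla-to-comma replacement mentioned above can be sketched as a small Unicode mapping. Modern Romanian orthography uses the comma-below letters (ș, ț) rather than the legacy cedilla forms (ş, ţ) found in many older datasets. This is a hypothetical illustration of that normalization, not the actual script used to prepare the lookup table:

```python
# Map legacy cedilla forms to the correct Romanian comma-below forms.
# A minimal sketch of the normalization described in the commit message;
# the function name is illustrative, not part of spaCy's API.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015F": "\u0219",  # ş -> ș
    "\u015E": "\u0218",  # Ş -> Ș
    "\u0163": "\u021B",  # ţ -> ț
    "\u0162": "\u021A",  # Ţ -> Ț
})

def normalize_diacritics(text: str) -> str:
    """Replace legacy cedilla diacritics with comma-below forms."""
    return text.translate(CEDILLA_TO_COMMA)

print(normalize_diacritics("raţă şi peşte"))  # -> rață și pește
```

Applying such a mapping to every entry of the source dataset yields lookup keys that match the comma-below characters produced by modern Romanian text and keyboards.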
__init__.py
examples.py
lemmatizer.py
stop_words.py
tokenizer_exceptions.py