Mirror of https://github.com/explosion/spaCy.git (synced 2025-10-27 22:21:08 +03:00)
* Add Romanian lemmatizer lookup table. Adapted from http://www.lexiconista.com/datasets/lemmatization/ by replacing cedillas with commas (ș and ț). The original dataset is licensed under the Open Database License.
* Fix one blatant issue in the Romanian lemmatizer
* Romanian examples file
* Add ro_tokenizer in conftest
* Add Romanian lemmatizer test

| Name |
|---|
| `__init__.py` |
| `examples.py` |
| `lemmatizer.py` |
| `stop_words.py` |
| `tokenizer_exceptions.py` |
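
The commit message says the lookup table was adapted by replacing cedilla letters with their comma-below forms (ș and ț). The following is a minimal sketch of what that normalization, followed by a plain dictionary lookup, could look like. The names `normalize_ro_diacritics`, `lookup_lemma`, and `TOY_TABLE` are hypothetical and are not taken from spaCy's code; the two table entries are illustrative only and do not come from the actual dataset.

```python
# Hypothetical helpers, not part of spaCy.
# Map the legacy cedilla letters to the comma-below letters
# used in standard Romanian orthography.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015f": "\u0219",  # s with cedilla  -> s with comma below (lowercase)
    "\u015e": "\u0218",  # S with cedilla  -> S with comma below (uppercase)
    "\u0163": "\u021b",  # t with cedilla  -> t with comma below (lowercase)
    "\u0162": "\u021a",  # T with cedilla  -> T with comma below (uppercase)
})


def normalize_ro_diacritics(text: str) -> str:
    """Replace cedilla s/t with their comma-below counterparts."""
    return text.translate(CEDILLA_TO_COMMA)


def lookup_lemma(word: str, table: dict) -> str:
    """Return the lemma from a lookup table, or the word itself if absent."""
    return table.get(word, word)


# Toy table for illustration only (assumed entries, not the real data).
TOY_TABLE = {
    "fe\u021be": "fa\u021b\u0103",  # written with comma-below t
    "ora\u0219e": "ora\u0219",      # written with comma-below s
}

if __name__ == "__main__":
    # Input typed with the legacy cedilla character.
    raw = "ora\u015fe"
    print(lookup_lemma(normalize_ro_diacritics(raw), TOY_TABLE))
```

Normalizing before the lookup matters in this sketch because the cedilla and comma-below letters are distinct Unicode code points, so a table keyed on one form would otherwise miss input written with the other.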