spaCy/spacy/tests/lang
Jani Monoses 0e08e49e87 Lemmatizer ro (#2319)
* Add Romanian lemmatizer lookup table.

Adapted from http://www.lexiconista.com/datasets/lemmatization/
by replacing cedillas with commas (ș and ț).

The original dataset is licensed under the Open Database License.

* Fix one blatant issue in the Romanian lemmatizer

* Romanian examples file

* Add ro_tokenizer in conftest

* Add Romanian lemmatizer test
2018-05-12 15:20:04 +02:00
bn Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
da Add Danish lemmatizer (#2184) 2018-04-07 19:07:28 +02:00
de Add German lemmatizer tests 2017-10-11 13:27:26 +02:00
en Drop six and related hacks as a dependency 2018-03-28 10:45:25 +02:00
es Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
fi Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
fr Fix French test (see #1617) 2017-11-20 13:59:59 +01:00
ga merge 2017-10-31 22:55:59 +00:00
he Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
hu Update tests 2017-06-05 02:09:27 +02:00
id added {pre,suf,in}fix tests 2017-08-20 13:43:00 +07:00
ja Port Japanese mecab tokenizer from v1 (#2036) 2018-05-03 18:38:26 +02:00
nb Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
ro Lemmatizer ro (#2319) 2018-05-12 15:20:04 +02:00
ru Added tag map, fixed tests fails, added more exceptions 2017-11-26 20:54:48 +03:00
sv Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
th add thai in spacy2 2017-09-26 21:36:27 +07:00
tr Adds Turkish Lemmatization 2017-12-01 17:04:32 +03:00
__init__.py Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
test_attrs.py added lex test for is_currency 2018-02-11 18:50:50 +01:00