spaCy/spacy/tests/lang
Paul O'Leary McCann 29a9e636eb Fix half-width space handling in JA (#4284) (closes #4262)
Before this patch, half-width spaces between words were simply lost in
Japanese text. This wasn't immediately noticeable because much Japanese
text never uses spaces at all.
2019-09-13 16:28:12 +02:00
ar Tidy up and format remaining files 2018-11-30 17:43:08 +01:00
bn 💫 Port master changes over to develop (#2979) 2018-11-29 16:30:29 +01:00
ca Tidy up and auto-format 2019-08-20 17:36:34 +02:00
da Make Danish tokenizer split on forward slash 2019-07-12 15:20:42 +02:00
de Tidy up and auto-format 2019-08-20 17:36:34 +02:00
el 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
en Allow period as suffix following punctuation (#4248) 2019-09-09 19:19:22 +02:00
es 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
fi 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
fr Tidy up and auto-format 2019-08-20 17:36:34 +02:00
ga 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
he 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
hr adds Croatian lemma_lookup.json, license file and corresponding tests (#4252) 2019-09-08 13:40:45 +02:00
hu Tidy up and format remaining files 2018-11-30 17:43:08 +01:00
id 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
it Improve Italian & Urdu tokenization accuracy (#3228) 2019-02-04 22:39:25 +01:00
ja Fix half-width space handling in JA (#4284) (closes #4262) 2019-09-13 16:28:12 +02:00
ko Fix ValueError exception on empty Korean text. (#4245) 2019-09-06 10:29:40 +02:00
lt Bloom-filter backed Lookup Tables (#4268) 2019-09-12 17:26:11 +02:00
nb 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
nl Bloom-filter backed Lookup Tables (#4268) 2019-09-12 17:26:11 +02:00
pl Tidy up and auto-format 2019-08-20 17:36:34 +02:00
pt 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
ro 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
ru Replacing regex library with re to increase tokenization speed (#3218) 2019-02-01 18:05:22 +11:00
sr Lemmatizer lookup dictionary for Serbian and basic tag set adde… (#4251) 2019-09-08 14:19:15 +02:00
sv Tidy up and fix small bugs and typos 2019-02-08 14:14:49 +01:00
th 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
tr 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
tt 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
uk Merge branch 'master' into develop 2019-02-25 15:54:55 +01:00
ur Tidy up and auto-format 2019-08-20 17:36:34 +02:00
__init__.py Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
test_attrs.py 💫 Tidy up and auto-format tests (#2967) 2018-11-27 01:09:36 +01:00
test_initialize.py Serbian language code update "rs" -> "sr" (#4159) 2019-08-21 19:57:37 +02:00