ar | Additions to Arabic stop words. (#2422) | 2018-06-08 02:33:23 +02:00
bn | update bengali token rules for hyphen and digits (#2731) | 2018-09-05 21:49:00 +02:00
da | Add Danish lemmatizer (#2184) | 2018-04-07 19:07:28 +02:00
de | Also include lowercase norm exceptions | 2018-10-13 15:37:30 +02:00
el | Optimize Greek language support (#2658) | 2018-08-14 02:31:32 +02:00
en | quick typo fix | 2018-03-24 17:26:35 +01:00
es | Fix Spanish noun_chunks (resolves #2210) | 2018-04-18 18:44:01 -04:00
fa | Add Persian(Farsi) language support (#2797) | 2018-10-13 15:31:49 +02:00
fi | Enhancement/lang fi examples (#2547) | 2018-07-15 09:50:27 +02:00
fr | Rule-based French Lemmatizer (#2818) | 2018-10-13 16:38:21 +02:00
ga | Remove comma that caused list to wrap in tuple! | 2017-10-31 20:13:16 +01:00
he | Don't make copies of language data components | 2017-10-11 15:34:55 +02:00
hi | Added numbers to ../lang/hi/lex_attrs.py (#2629) | 2018-08-08 16:06:11 +02:00
hr | Update stop_words.py | 2018-03-24 17:31:24 +01:00
hu | Don't copy exception dicts if not necessary and tidy up | 2017-10-31 21:05:29 +01:00
id | Update Indonesian model (#2752) | 2018-09-14 12:30:32 +02:00
it | Fix syntax error in italian lemmatizer | 2018-04-03 23:13:22 +02:00
ja | Add Japanese stop words. (#2549) | 2018-07-17 10:12:48 +02:00
nb | changed tag_map, morph_rules, lemmatizer for Norwegian (#2565) | 2018-07-19 19:38:24 +02:00
nl | Fix typo [ci skip] | 2018-07-24 18:45:40 +02:00
pl | Lex _attrs for polish language (#2750) | 2018-09-10 11:53:57 +02:00
pt | Update Portuguese Language (#2790) | 2018-09-29 09:51:45 +02:00
ro | Updates to Romanian support (#2354) | 2018-05-24 11:40:00 +02:00
ru | Correcting lang/ru/examples.py (#2845) | 2018-10-13 15:19:43 +02:00
si | Adding "This is a sentence" example to Sinhala (#2846) | 2018-10-14 00:06:40 +02:00
sv | Add abbreviations from UD_Swedish-Talbanken (#2613) | 2018-08-07 13:53:17 +02:00
te | Basic support for Telugu language (#2751) | 2018-09-10 11:53:18 +02:00
th | Don't copy exception dicts if not necessary and tidy up | 2017-10-31 21:05:29 +01:00
tr | Port over Turkish changes | 2018-03-24 17:31:07 +01:00
tt | Add Tatar Language Support (#2444) | 2018-06-19 10:17:53 +02:00
ur | Add Urdu Language Support (#2430) | 2018-06-22 11:14:03 +02:00
vi | Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer (#2155) | 2018-03-29 12:19:51 +02:00
xx | Tidy up language data | 2017-10-11 02:22:49 +02:00
zh | Fix Chinese language related bugs (#2634) | 2018-08-07 11:26:31 +02:00
__init__.py | Remove imports in /lang/__init__.py | 2017-05-08 23:58:07 +02:00
char_classes.py | Adding basic support for Sinhala language. (#2788) | 2018-09-25 12:18:25 +02:00
entity_rules.py | Reorganise entity rules | 2017-05-09 01:37:10 +02:00
lex_attrs.py | Merge pull request #1891 from fucking-signup/master | 2018-02-18 13:47:47 +01:00
norm_exceptions.py | Update base norm exceptions with more unicode characters | 2017-10-14 14:58:52 +02:00
punctuation.py | Add symbols class to punctuation rules to handle emoji (see #1088) | 2017-05-27 17:57:10 +02:00
tag_map.py | Fix formatting | 2017-05-09 11:08:14 +02:00
tokenizer_exceptions.py | Tidy up tokenizer exceptions | 2017-11-01 23:02:45 +01:00