am
Add Amharic አማርኛ Language support (#6583)
2020-12-22 16:50:34 +01:00
ar
Revert #4334
2019-09-29 17:32:12 +02:00
bn
Revert #4334
2019-09-29 17:32:12 +02:00
ca
Revert #4334
2019-09-29 17:32:12 +02:00
cs
Adding num_like test for Czech (#5946)
2020-08-21 17:06:33 +02:00
da
Add (noun chunks) syntax iterators for Danish (#6246)
2021-01-07 16:33:00 +11:00
de
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
el
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
en
Cast to uint64 for all array-based doc representations (#11940)
2022-12-15 08:16:14 +01:00
es
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
eu
Add __init__.py to eu and hy tests (#5278)
2020-04-08 20:03:06 +02:00
fa
Fix syntax iterators for Persian (#5437)
2020-05-14 16:51:03 +02:00
fi
add two abbreviations and some additional unit tests (#5040)
2020-02-22 14:12:32 +01:00
fr
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
ga
Revert #4334
2019-09-29 17:32:12 +02:00
gu
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
he
Hebrew like num (#5952)
2020-08-24 14:30:05 +02:00
hi
Hindi: Adds tests for lexical attributes (norm and like_num) (#5829)
2020-10-07 10:23:32 +02:00
hu
Tidy up and auto-format
2020-03-25 12:28:12 +01:00
hy
Add missing declaration
2020-05-21 17:30:05 +02:00
id
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
it
Revert #4334
2019-09-29 17:32:12 +02:00
ja
Revert "Convert custom user_data to token extension format for Japanese tokenizer (#5652)" (#5665)
2020-06-29 14:34:15 +02:00
ko
Revert #4334
2019-09-29 17:32:12 +02:00
ky
Add tests
2021-01-24 20:56:16 +06:00
lb
Reduce stored lexemes data, move feats to lookups (#5238)
2020-05-19 15:59:14 +02:00
lt
Improve Lithuanian tokenization (#5205)
2020-03-25 11:28:12 +01:00
mk
Include Macedonian language (#6230)
2020-10-15 15:55:01 +02:00
ml
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
nb
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
ne
Add Nepali Language (#5622)
2020-06-22 10:25:46 +02:00
nl
Move lookup tables out of the core library (#4346)
2019-10-01 00:01:27 +02:00
pl
Update Polish tokenizer for UD_Polish-PDB (#5432)
2020-05-19 15:59:55 +02:00
pt
Revert #4334
2019-09-29 17:32:12 +02:00
ro
Move lookup tables out of the core library (#4346)
2019-10-01 00:01:27 +02:00
ru
Revert #4334
2019-09-29 17:32:12 +02:00
sa
Added support for Sanskrit language (#5956)
2020-08-25 10:56:29 +02:00
sr
Move lookup tables out of the core library (#4346)
2019-10-01 00:01:27 +02:00
sv
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
th
Revert #4334
2019-09-29 17:32:12 +02:00
ti
Add Amharic አማርኛ Language support (#6583)
2020-12-22 16:50:34 +01:00
tr
Turkish tokenization improvements (#6268)
2020-10-29 09:43:17 +01:00
tt
Add trailing whitespace to multiline test text (#4877)
2020-01-06 14:58:59 +01:00
uk
Revert #4334
2019-09-29 17:32:12 +02:00
ur
Revert #4334
2019-09-29 17:32:12 +02:00
yo
Adding support for Yoruba Language (#4614)
2019-12-21 14:11:50 +01:00
zh
Tidy up and auto-format
2020-05-21 14:14:01 +02:00
__init__.py
Revert #4334
2019-09-29 17:32:12 +02:00
test_attrs.py
Tidy up and auto-format
2019-12-21 19:04:17 +01:00
test_initialize.py
Adding support for Yoruba Language (#4614)
2019-12-21 14:11:50 +01:00