af
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
am
Tidy up and auto-format
2021-01-15 11:57:36 +11:00
ar
Remove POS, TAG and LEMMA from tokenizer exceptions
2020-07-22 23:09:01 +02:00
bg
Handle Cyrillic combining diacritics (#10837)
2022-06-28 15:35:32 +02:00
bn
Drop Python 2.7 and 3.5 (#4828)
2019-12-22 01:53:56 +01:00
ca
Update Catalan tokenizer (#9297)
2021-09-27 14:42:30 +02:00
cs
Remove unicode declarations and update language data
2020-09-04 13:19:16 +02:00
da
Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-rc3
2021-01-14 11:49:58 +01:00
de
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
dsb
Add Lower Sorbian support. (#10431)
2022-03-07 16:57:14 +01:00
el
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
en
Remove English exceptions with mismatched features (#10873)
2022-06-03 09:44:04 +02:00
es
Migrate regression tests into the main test suite (#9655)
2021-12-04 20:34:48 +01:00
et
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
eu
Merge branch 'develop' into master-tmp
2020-05-21 18:39:06 +02:00
fa
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
fi
Auto-format code with black (#10333)
2022-02-21 09:15:42 +01:00
fr
Revert "Bump sudachipy version (#9917)" (#10071)
2022-01-17 10:38:37 +01:00
ga
Drop Python 2.7 and 3.5 (#4828)
2019-12-22 01:53:56 +01:00
grc
add punctuation to grc (#11426)
2022-09-27 11:38:56 +02:00
gu
Remove unicode declarations and tidy up
2020-06-21 22:34:10 +02:00
he
Merge branch 'develop' into master-tmp
2020-09-04 13:15:36 +02:00
hi
Migrate regression tests into the main test suite (#9655)
2021-12-04 20:34:48 +01:00
hr
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
hsb
Add Upper Sorbian support. (#10432)
2022-03-07 16:20:39 +01:00
hu
🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)
2021-10-14 15:21:40 +02:00
hy
Remove unicode declarations and tidy up
2020-06-21 22:34:10 +02:00
id
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
is
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
it
Revert "Bump sudachipy version (#9917)" (#10071)
2022-01-17 10:38:37 +01:00
ja
Migrate regression tests into the main test suite (#9655)
2021-12-04 20:34:48 +01:00
ko
Handle unknown tags in KoreanTokenizer tag map (#10536)
2022-03-24 11:25:36 +01:00
ky
Update Cython string types (#9143)
2021-09-13 17:02:17 +02:00
la
Update LatinDefaults for lang 'la' (#12538)
2023-04-20 16:55:40 +02:00
lb
Remove POS, TAG and LEMMA from tokenizer exceptions
2020-07-22 23:09:01 +02:00
lg
luganda language extension (#10847)
2022-08-23 13:09:36 +02:00
lt
Merge branch 'master' into tmp/sync
2020-03-26 13:38:14 +01:00
lv
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
mk
Tidy up and auto-format
2021-01-05 13:41:53 +11:00
ml
Remove unicode declarations and tidy up
2020-06-21 22:34:10 +02:00
nb
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
ne
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
nl
Fix Dutch noun chunks to skip overlapping spans (#11275)
2022-08-10 09:49:08 +02:00
pl
Merge branch 'develop' into master-tmp
2020-05-21 18:39:06 +02:00
pt
Portuguese noun chunks review (#9559)
2021-11-04 23:55:49 +01:00
ro
Drop Python 2.7 and 3.5 (#4828)
2019-12-22 01:53:56 +01:00
ru
Update Russian and Ukrainian lemmatizers (#11811)
2022-11-25 11:12:46 +01:00
sa
Tidy up and auto-format
2020-09-29 21:39:28 +02:00
sk
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
sl
Updates to Slovenian language (#11162)
2022-08-05 10:10:18 +02:00
sq
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
sr
Use Latin normalization for Serbian attrs (#12608)
2023-05-08 12:33:56 +02:00
sv
Bugfix/swedish tokenizer (#12315)
2023-02-27 10:53:45 +01:00
ta
Basic tests for the Tamil language (#10629)
2022-04-07 14:47:37 +02:00
th
Update custom tokenizer APIs and pickling (#8972)
2021-08-19 14:37:47 +02:00
ti
Update Tigrinya ትግርኛ language support (#8900)
2021-08-10 13:55:08 +02:00
tl
Add initial Tagalog (tl) tests (#9582)
2021-11-02 08:35:49 +01:00
tr
removing print statements from the test suite (#10712)
2022-04-27 09:14:25 +02:00
tt
Merge branch 'master' into develop
2020-02-18 14:47:23 +01:00
uk
Update Russian and Ukrainian lemmatizers (#11811)
2022-11-25 11:12:46 +01:00
ur
Drop Python 2.7 and 3.5 (#4828)
2019-12-22 01:53:56 +01:00
vi
Update custom tokenizer APIs and pickling (#8972)
2021-08-19 14:37:47 +02:00
xx
New tests for a number of alpha languages (#9703)
2021-11-28 21:59:23 +01:00
yo
Drop Python 2.7 and 3.5 (#4828)
2019-12-22 01:53:56 +01:00
zh
Tidy up and auto-format
2020-10-03 17:20:18 +02:00
__init__.py
Revert #4334
2019-09-29 17:32:12 +02:00
test_attrs.py
Intify IOB (#9738)
2022-01-20 13:19:38 +01:00
test_initialize.py
Fix Azerbaijani init, extend lang init tests (#8656)
2021-07-09 15:36:35 +02:00
test_lemmatizers.py
Update Catalan language data (#8308)
2021-06-11 10:21:22 +02:00