spaCy/spacy/lang
Jeff Adolphe 41e07772dc
Added Haitian Creole (ht) Language Support to spaCy (#13807)
This PR adds official support for Haitian Creole (ht) to spaCy's spacy/lang module.
It includes:

    Added all core language data files for spacy/lang/ht:
        tokenizer_exceptions.py
        punctuation.py
        lex_attrs.py
        syntax_iterators.py
        lemmatizer.py
        stop_words.py
        tag_map.py

    Unit tests for tokenizer and noun chunking (test_tokenizer.py, test_noun_chunking.py, etc.). Passed all 58 pytest spacy/tests/lang/ht tests that I've created.

    Basic tokenizer rules adapted for Haitian Creole orthography and informal contractions.

    Custom like_num atrribute supporting Haitian number formats (e.g., "3yèm").

    Support for common informal apostrophe usage (e.g., "m'ap", "n'ap", "di'm").

    Ensured no breakages in other language modules.

    Followed spaCy coding style (PEP8, Black).

This provides a foundation for Haitian Creole NLP development using spaCy.
2025-05-28 17:23:38 +02:00
..
af Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
am Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ar Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
az Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
bg Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
bn Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
bo Format 2024-09-09 11:22:52 +02:00
ca Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
cs Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
da Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
de Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
dsb Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
el Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
en feat: add extra lexical attributes (#13106) 2023-11-08 17:29:11 +01:00
es Fix typo in lemmatizer.py (#13003) 2023-09-25 11:44:35 +02:00
et Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
eu Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
fa Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
fi Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
fo Feature/nn and fo language extensions (#13116) 2023-11-20 07:49:59 +01:00
fr Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ga Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
gd Format 2024-09-09 11:22:52 +02:00
grc Add Left and Right Pointing Angle Brackets as punctuation to ancient Greek (#12829) 2023-07-20 11:16:01 +02:00
gu Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
he Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
hi Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
hr Fix misspelling (#13631) [ci skip] 2024-10-11 11:23:12 +02:00
hsb Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ht Added Haitian Creole (ht) Language Support to spaCy (#13807) 2025-05-28 17:23:38 +02:00
hu Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
hy Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
id Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
is Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
it Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ja Python 3.13 support (#13823) 2025-05-22 13:47:21 +02:00
kmr Fix memory zones 2024-09-09 13:49:41 +02:00
kn Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ko Python 3.13 support (#13823) 2025-05-22 13:47:21 +02:00
ky Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
la Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
lb Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
lg Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
lij Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
lt Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
lv Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
mk Format 2024-09-14 11:29:40 +02:00
ml Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
mr Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ms Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
nb Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ne Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
nl Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
nn Feature/nn and fo language extensions (#13116) 2023-11-20 07:49:59 +01:00
pl Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
pt Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ro Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ru Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
sa Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
si Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
sk Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
sl Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
sq Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
sr Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
sv Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ta Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
te Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
th Python 3.13 support (#13823) 2025-05-22 13:47:21 +02:00
ti Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
tl Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
tn Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
tr Update examples.py (#12895) 2023-08-11 15:38:06 +02:00
tt Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
uk Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
ur Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
vi Python 3.13 support (#13823) 2025-05-22 13:47:21 +02:00
xx fix: Add missing comma to examples.py (#10167) 2022-01-30 16:43:29 +09:00
yo Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
zh Python 3.13 support (#13823) 2025-05-22 13:47:21 +02:00
__init__.py Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
char_classes.py add punctuation to grc (#11426) 2022-09-27 11:38:56 +02:00
lex_attrs.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
norm_exceptions.py Tidy up and auto-format 2020-02-18 15:38:18 +01:00
punctuation.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
tokenizer_exceptions.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00