spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-02-23 06:50:32 +03:00

Hiroshi Matsuda 150a39ccca Japanese model: add user_dict entries and small refactor (#5573)	2020-06-22 14:32:25 +02:00
* user_dict fields: adding inflections, reading_forms, sub_tokens; deleting: unidic_tags; improve code readability around the token alignment procedure
* add test cases, replace fugashi with sudachipy in conftest
* move bunsetu.py to spaCy Universe as a pipeline component BunsetuRecognizer
* tag is space -> both surface and tag are spaces
* consider len(text)==0
__init__.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_lemmatization.py	Add Japanese Model (#5544)	2020-06-04 19:15:43 +02:00
test_serialize.py	Update Japanese tokenizer config and add serialization (#5562)	2020-06-08 16:29:05 +02:00
test_tokenizer.py	Japanese model: add user_dict entries and small refactor (#5573)	2020-06-22 14:32:25 +02:00