spaCy/spacy/lang/ja
Hiroshi Matsuda 150a39ccca
Japanese model: add user_dict entries and small refactor (#5573)
* user_dict fields: adding inflections, reading_forms, sub_tokens
  deleting: unidic_tags
  improve code readability around the token alignment procedure

* add test cases, replace fugashi with sudachipy in conftest

* move bunsetu.py to spaCy Universe as a pipeline component BunsetuRecognizer

* tag is space -> both surface and tag are spaces

* handle empty text (len(text)==0)
2020-06-22 14:32:25 +02:00
__init__.py          Japanese model: add user_dict entries and small refactor (#5573)  2020-06-22 14:32:25 +02:00
examples.py          💫 Tidy up and auto-format .py files (#2983)                      2018-11-30 17:03:03 +01:00
stop_words.py        💫 Tidy up and auto-format .py files (#2983)                      2018-11-30 17:03:03 +01:00
syntax_iterators.py  Add Japanese Model (#5544)                                        2020-06-04 19:15:43 +02:00
tag_bigram_map.py    Add Japanese Model (#5544)                                        2020-06-04 19:15:43 +02:00
tag_map.py           Add Japanese Model (#5544)                                        2020-06-04 19:15:43 +02:00
tag_orth_map.py      Add Japanese Model (#5544)                                        2020-06-04 19:15:43 +02:00