spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-04-18 16:11:58 +03:00

Adriane Boyd 1d59fdbd39 Update Vietnamese tokenizer (#8099)	2021-05-17 18:16:20 +10:00
* Adapt tokenization methods from `pyvi` to preserve text encoding and whitespace
* Add serialization support similar to Chinese and Japanese
Note: as for Chinese and Japanese, some settings are duplicated in `config.cfg` and `tokenizer/cfg`.
__init__.py	Update Vietnamese tokenizer (#8099)	2021-05-17 18:16:20 +10:00
lex_attrs.py	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
stop_words.py	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00