mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-25 17:36:30 +03:00
1d59fdbd39
* Adapt tokenization methods from `pyvi` to preserve text encoding and whitespace
* Add serialization support similar to Chinese and Japanese

Note: as for Chinese and Japanese, some settings are duplicated in `config.cfg` and `tokenizer/cfg`.
0 lines
Python