spaCy

explosion/spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-13 13:17:06 +03:00

Commit Graph

Author	SHA1	Message	Date
Adriane Boyd	1d59fdbd39	Update Vietnamese tokenizer (#8099): adapt tokenization methods from `pyvi` to preserve text encoding and whitespace; add serialization support similar to Chinese and Japanese. Note: as for Chinese and Japanese, some settings are duplicated in `config.cfg` and `tokenizer/cfg`.	2021-05-17 18:16:20 +10:00

1 commit