spaCy/website/docs/usage
Adriane Boyd 39ebcd9ec9
Refactor Chinese tokenizer configuration (#5736)
* Refactor Chinese tokenizer configuration

Refactor `ChineseTokenizer` configuration so that it uses a single
`segmenter` setting to choose between character segmentation, jieba, and
pkuseg.

* replace `use_jieba`, `use_pkuseg`, and `require_pkuseg` with a single
`segmenter` setting that supports the values `char`, `jieba`, and `pkuseg`
* make plain character segmentation (`char`) the default segmenter (no
additional libraries required)
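
The move from three boolean flags to one `segmenter` value can be illustrated with a small standalone sketch. It does not import spaCy, and `Segmenter`, `segmenter_from_legacy_flags`, and `validate_segmenter` are hypothetical names invented for this illustration. The legacy `require_pkuseg` flag is left out, and the sketch rejects the case where both flags are set rather than guessing which one took precedence.

```python
from enum import Enum


class Segmenter(str, Enum):
    """The three supported values named in the commit message."""

    char = "char"
    jieba = "jieba"
    pkuseg = "pkuseg"


def segmenter_from_legacy_flags(use_jieba=False, use_pkuseg=False):
    """Hypothetical migration helper, not part of spaCy.

    Maps the old boolean flags onto a single segmenter value.
    """
    if use_jieba and use_pkuseg:
        # Assumption: this sketch refuses ambiguous input instead of
        # guessing a precedence between the two libraries.
        raise ValueError("set only one of use_jieba / use_pkuseg")
    if use_pkuseg:
        return Segmenter.pkuseg
    if use_jieba:
        return Segmenter.jieba
    # No flags set: fall back to plain character segmentation.
    return Segmenter.char


def validate_segmenter(value):
    """Return the matching Segmenter, or raise ValueError if unsupported."""
    return Segmenter(value)
```

A single enumerated setting makes invalid combinations unrepresentable, which is one plausible motivation for this kind of refactor.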

* Fix Chinese serialization test to use char default

* Warn if attempting to customize other segmenter

Add a warning if `Chinese.pkuseg_update_user_dict` is called when
another segmenter is selected.
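
A rough sketch of the warning behavior described above, using a hypothetical stand-in class rather than spaCy's own tokenizer. The class name, the warning text, and the choice to skip the update after warning are all assumptions made for illustration; only the method name `pkuseg_update_user_dict` comes from the commit message.

```python
import warnings


class SketchChineseTokenizer:
    """Hypothetical stand-in, not spaCy's ChineseTokenizer."""

    def __init__(self, segmenter="char"):
        self.segmenter = segmenter
        self.user_words = set()

    def pkuseg_update_user_dict(self, words):
        if self.segmenter != "pkuseg":
            # Assumption: warn and leave the user dictionary untouched.
            warnings.warn(
                f"segmenter is '{self.segmenter}', not 'pkuseg'; "
                "the pkuseg user dictionary was not updated"
            )
            return
        self.user_words.update(words)
```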
2020-07-19 13:34:37 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| 101 | fix component constructors, update, begin_training, reference to GoldParse | 2020-07-07 19:17:19 +02:00 |
| _benchmarks-choi.md | 💫 Update website (#3285) | 2019-02-17 19:31:19 +01:00 |
| facts-figures.md | Remove old sections [ci skip] (closes #4961) | 2020-02-03 13:10:46 +01:00 |
| index.md | Update v3 docs [ci skip] | 2020-07-05 16:11:16 +02:00 |
| linguistic-features.md | Remove object subclassing | 2020-07-12 14:03:23 +02:00 |
| models.md | Refactor Chinese tokenizer configuration (#5736) | 2020-07-19 13:34:37 +02:00 |
| processing-pipelines.md | Remove object subclassing | 2020-07-12 14:03:23 +02:00 |
| projects.md | Update projects.md [ci skip] | 2020-07-10 22:41:27 +02:00 |
| rule-based-matching.md | Remove object subclassing | 2020-07-12 14:03:23 +02:00 |
| saving-loading.md | Remove object subclassing | 2020-07-12 14:03:23 +02:00 |
| spacy-101.md | fix component constructors, update, begin_training, reference to GoldParse | 2020-07-07 19:17:19 +02:00 |
| training.md | Update docs | 2020-07-12 12:32:28 +02:00 |
| v2-1.md | Remove u-strings and fix formatting [ci skip] | 2019-09-12 16:11:15 +02:00 |
| v2-2.md | Update v3 docs [ci skip] | 2020-07-05 16:11:16 +02:00 |
| v2-3.md | Extend v2.3 migration guide (#5653) | 2020-06-26 14:13:01 +02:00 |
| v2.md | Update v3 docs [ci skip] | 2020-07-05 16:11:16 +02:00 |
| v3.md | Add new in v3.0 | 2020-07-01 13:02:17 +02:00 |
| vectors-embeddings.md | Update WIP | 2020-07-06 22:22:37 +02:00 |
| visualizers.md | Update projects docs etc. | 2020-07-09 19:43:25 +02:00 |