spaCy/spacy/cli
Matthew Honnibal bb911e5f4e Fix #3830: 'subtok' label being added even if learn_tokens=False (#4188)
* Prevent subtok label if not learning tokens

The parser introduces the subtok label to mark tokens that should be
merged during post-processing. Previously this happened even if we did
not have the --learn-tokens flag set. This patch passes the config
through to the parser, to prevent the problem.

* Make merge_subtokens a parser post-process if learn_subtokens

* Fix train script

* Add test for 3830: subtok problem

* Fix handling of non-subtok in parser training
2019-08-23 17:54:00 +02:00
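
The behaviour described in the commit message above can be sketched in plain Python. This is a simplified illustration, not spaCy's actual implementation: `collect_dep_labels` and this toy `merge_subtokens` are hypothetical helpers operating on plain lists, and the assumption that a "subtok" token is joined to the token that follows it is a simplification made for the example.

```python
SUBTOK_LABEL = "subtok"


def collect_dep_labels(gold_labels, learn_tokens=False):
    """Return the dependency labels a parser would be trained on.

    Hypothetical helper: the "subtok" label is only added when
    learn_tokens is True, mirroring the intent of the fix.
    """
    labels = set(gold_labels)
    if learn_tokens:
        labels.add(SUBTOK_LABEL)
    return sorted(labels)


def merge_subtokens(tokens, deps, label=SUBTOK_LABEL):
    """Toy post-processing step: join each token labelled `label`
    with the token(s) that follow it, up to the first other label."""
    merged = []
    buffer = ""
    for token, dep in zip(tokens, deps):
        buffer += token
        if dep != label:
            merged.append(buffer)
            buffer = ""
    if buffer:
        # Trailing subtok tokens with nothing to attach to are kept as-is.
        merged.append(buffer)
    return merged


if __name__ == "__main__":
    print(collect_dep_labels(["nsubj", "dobj"]))
    print(collect_dep_labels(["nsubj", "dobj"], learn_tokens=True))
    print(merge_subtokens(["New", "York", "is", "big"],
                          ["subtok", "nsubj", "ROOT", "acomp"]))
```

With `learn_tokens=False` the label set is left untouched, so a parser trained on it would never predict "subtok" and the merge step would have nothing to do.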
converters Replace cytoolz.partition_all with util.minibatch 2019-05-11 21:12:09 +02:00
__init__.py Move UD scripts to bin 2019-03-20 01:19:34 +01:00
_schemas.py Store JSON schemas in Python and tidy up (#3235) 2019-02-07 19:44:31 +11:00
convert.py Change default output format from jsonl to json for cli convert (#3583) (closes #3523) 2019-04-12 11:31:23 +02:00
debug_data.py Tidy up and auto-format 2019-08-18 15:09:16 +02:00
download.py Require downloaded model in pkg_resources (#4090) 2019-08-07 13:18:11 +02:00
evaluate.py Open file as utf-8 (closes #4138) 2019-08-18 13:55:34 +02:00
info.py Small CLI improvements (#3030) 2018-12-08 11:49:43 +01:00
init_model.py Tidy up and auto-format 2019-08-18 15:09:16 +02:00
link.py Small CLI improvements (#3030) 2018-12-08 11:49:43 +01:00
package.py Also support "requirements" in model.json 2019-07-27 13:34:57 +02:00
pretrain.py Fix absolute imports and avoid importing from cli 2019-08-20 15:08:59 +02:00
profile.py Fix cytoolz import cytoolz 2018-12-06 16:04:12 +01:00
train.py Fix #3830: 'subtok' label being added even if learn_tokens=False (#4188) 2019-08-23 17:54:00 +02:00
validate.py Strip out .dev versions in spacy validate [ci skip] 2019-03-17 12:16:53 +01:00