spaCy/spacy/gold
Matthew Honnibal ecb3c4e8f4
Create corpus iterator and batcher from registry during training (#5865)
* Move batchers into their own module (and registry)

* Update CLI

* Update Corpus and batcher

* Update tests

* Update one config

* Merge 'evaluation' block back under [training]

* Import batchers in gold __init__

* Fix batchers

* Update config

* Update schema

* Update util

* Don't assume train and dev are actually paths

* Update onto-joint config

* Fix missing import

* Format

* Format

* Update spacy/gold/corpus.py

Co-authored-by: Ines Montani <ines@ines.io>

* Fix name

* Update default config

* Fix get_length option in batchers

* Update test

* Add comment

* Pass path into Corpus

* Update docstring

* Update schema and configs

* Update config

* Fix test

* Fix paths

* Fix print

* Fix create_train_batches

* [training.read_train] -> [training.train_corpus]

* Update onto-joint config

Co-authored-by: Ines Montani <ines@ines.io>
2020-08-04 15:09:37 +02:00
converters   | Refactor pipeline components, config and language data (#5759) | 2020-07-22 13:42:59 +02:00
__init__.pxd | Improve spacy.gold (no GoldParse, no json format!) (#5555) | 2020-06-26 19:34:12 +02:00
__init__.py  | Create corpus iterator and batcher from registry during training (#5865) | 2020-08-04 15:09:37 +02:00
align.py     | Tidy up, autoformat, add types | 2020-07-25 15:01:15 +02:00
augment.py   | Fix typo | 2020-07-22 20:27:22 +02:00
batchers.py  | Create corpus iterator and batcher from registry during training (#5865) | 2020-08-04 15:09:37 +02:00
corpus.py    | Create corpus iterator and batcher from registry during training (#5865) | 2020-08-04 15:09:37 +02:00
example.pxd  | Recalculate alignment if tokenization differs (#5868) | 2020-08-04 14:31:32 +02:00
example.pyx  | Recalculate alignment if tokenization differs (#5868) | 2020-08-04 14:31:32 +02:00
gold_io.pyx  | Make docs_to_json backwards-compatible with v2 (#5714) | 2020-07-06 14:15:00 +02:00
iob_utils.py | Auto-format | 2020-07-04 16:25:34 +02:00