spaCy/spacy/tests/training
Daniël de Kok b052b1b47f
Fix batching regression (#12094)
* Fix batching regression

Some time ago, the spaCy v4 branch switched to the new Thinc v9
schedule. However, this introduced an error in how batching is handled.

In the PR, the batchers were changed to keep track of their step,
so that the step can be passed to the schedule. However, the issue
is that the training loop repeatedly calls the batching functions
(rather than using an infinite generator/iterator). So, the step and
therefore the schedule would be reset each epoch. Before the schedule
switch we didn't have this issue, because the old schedules were
stateful.

This PR fixes this issue by reverting the batching functions to use
a (stateful) generator. Their registry functions do accept a `Schedule`
and we convert `Schedule`s to generators.

* Update batcher docs

* Docstring fixes

* Make minibatch take iterables again as well

* Bump thinc requirement to 9.0.0.dev2

* Use type declaration

* Convert another comment into a proper type declaration
2023-01-18 18:28:30 +01:00
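
The regression described in the commit message above can be illustrated with a minimal, self-contained sketch. It does not use the spaCy or Thinc APIs: `growing_size`, `schedule_to_generator`, `minibatch_buggy`, and `minibatch_fixed` are hypothetical names chosen for illustration, and the schedule is modeled as a plain function from step to batch size. The sketch only shows why a step counter kept inside the batching call restarts each epoch, while a shared generator keeps advancing.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def growing_size(step: int) -> int:
    """Hypothetical stateless schedule: the batch size grows with the step."""
    return 1 + step


def schedule_to_generator(
    schedule: Callable[[int], int], start: int = 0
) -> Iterator[int]:
    """Wrap a stateless schedule in a stateful, infinite generator."""
    step = start
    while True:
        yield schedule(step)
        step += 1


def minibatch_buggy(
    items: Iterable[T], schedule: Callable[[int], int]
) -> Iterator[List[T]]:
    """The step lives inside the call, so every call starts again at step 0."""
    iterator = iter(items)
    step = 0
    while True:
        batch = list(islice(iterator, schedule(step)))
        if not batch:
            return
        yield batch
        step += 1


def minibatch_fixed(items: Iterable[T], sizes: Iterator[int]) -> Iterator[List[T]]:
    """The sizes come from a generator whose state outlives a single call."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, next(sizes)))
        if not batch:
            return
        yield batch


def batch_lengths(batches: Iterable[List[T]]) -> List[int]:
    """Helper for inspecting the batch sizes that were actually produced."""
    return [len(batch) for batch in batches]
```

Calling `minibatch_buggy(data, growing_size)` once per epoch yields the same sequence of batch sizes every time, because the step is reset on each call. Creating `sizes = schedule_to_generator(growing_size)` once and passing it to `minibatch_fixed` in every epoch lets the batch size keep growing across epochs. Note that in this sketch the final, empty batch of each epoch also consumes one value from the generator.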
__init__.py move tests to correct subdir 2020-09-15 21:40:38 +02:00
test_augmenters.py Preserve missing entity annotation in augmenters (#11540) 2022-09-27 10:16:51 +02:00
test_logger.py Add ConsoleLogger.v2 (#11214) 2022-08-29 10:23:05 +02:00
test_new_example.py adding spans to doc_annotation in Example.to_dict (#11261) 2022-08-05 12:26:38 +02:00
test_pretraining.py Tagger: use unnormalized probabilities for inference (#10197) 2022-03-15 14:15:31 +01:00
test_readers.py Revert "disable failing test because Stanford servers are down (#11015)" (#11054) 2022-06-30 11:24:54 +02:00
test_rehearse.py bugfix parser labels (#10797) 2022-05-13 11:41:32 +02:00
test_training.py Fix batching regression (#12094) 2023-01-18 18:28:30 +01:00