spaCy/spacy/tests/pipeline
Daniël de Kok b052b1b47f
Fix batching regression (#12094)
* Fix batching regression

Some time ago, the spaCy v4 branch switched to the new Thinc v9
schedule. However, this introduced an error in how batching is handed.

In the PR, the batchers were changed to keep track of their step,
so that the step can be passed to the schedule. However, the issue
is that the training loop repeatedly calls the batching functions
(rather than using an infinite generator/iterator). So, the step and
therefore the schedule would be reset each epoch. Before the schedule
switch we didn't have this issue, because the old schedules were
stateful.

This PR fixes this issue by reverting the batching functions to use
a (stateful) generator. Their registry functions do accept a `Schedule`
and we convert `Schedule`s to generators.

* Update batcher docs

* Docstring fixes

* Make minibatch take iterables again as well

* Bump thinc requirement to 9.0.0.dev2

* Use type declaration

* Convert another comment into a proper type declaration
2023-01-18 18:28:30 +01:00
..
__init__.py Revert #4334 2019-09-29 17:32:12 +02:00
test_analysis.py Simplify pipe analysis 2020-08-01 13:40:06 +02:00
test_annotates_on_update.py Tidy up and auto-format 2021-07-18 15:44:56 +10:00
test_attributeruler.py Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
test_edit_tree_lemmatizer.py Add TrainablePipe.{distill,get_teacher_student_loss} (#12016) 2023-01-16 10:25:53 +01:00
test_entity_linker.py Merge remote-tracking branch 'upstream/master' into chore/update-v4-from-master-4 2022-11-03 09:42:36 +01:00
test_entity_ruler.py update tests from master to follow v4 principles (2) 2023-01-11 19:04:06 +01:00
test_functions.py Add doc_cleaner component (#9659) 2021-11-23 15:33:33 +01:00
test_initialize.py Test with default value 2020-09-29 17:00:40 +02:00
test_lemmatizer.py Tidy up and auto-format 2021-07-18 15:44:56 +10:00
test_models.py Make stable private modules public and adjust names (#11353) 2022-08-30 13:56:35 +02:00
test_morphologizer.py Add TrainablePipe.{distill,get_teacher_student_loss} (#12016) 2023-01-16 10:25:53 +01:00
test_pipe_factories.py Auto-format code with black (#10795) 2022-05-13 19:02:08 +02:00
test_pipe_methods.py Remove all references to "begin_training" (#11943) 2022-12-08 11:43:52 +01:00
test_sentencizer.py Refactor Docs.is_ flags (#6044) 2020-09-17 00:14:01 +02:00
test_senter.py Add TrainablePipe.{distill,get_teacher_student_loss} (#12016) 2023-01-16 10:25:53 +01:00
test_span_ruler.py Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
test_spancat.py Merge branch 'copy_master' into copy_v4 2022-12-05 08:56:15 +01:00
test_tagger.py Fix batching regression (#12094) 2023-01-18 18:28:30 +01:00
test_textcat.py Fix batching regression (#12094) 2023-01-18 18:28:30 +01:00
test_tok2vec.py Merge the parser refactor into v4 (#10940) 2023-01-18 11:27:45 +01:00