spaCy/spacy/ml
Daniël de Kok e27c60a702
Reimplement distillation with oracle cut size (#12214)
* Improve the correctness of _parse_patch

* If there are no more actions, do not attempt to make further
  transitions, even if not all states are final.
* Assert that the number of actions for a step is the same as
  the number of states.
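
  The two points above can be sketched in plain Python. This is only an
  illustration, not the code in tb_framework.pyx: `apply_actions`, the
  list-based states, and the `is_final` callback are hypothetical stand-ins,
  and the real implementation may track states differently.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "state" is just the list of actions applied so far.
State = List[int]


def apply_actions(
    states: List[State],
    steps: Sequence[Sequence[int]],
    is_final: Callable[[State], bool],
) -> List[State]:
    """Apply one action per state for each step, stopping when steps run out."""
    step = 0
    # Stop as soon as the actions are exhausted, even if some states
    # are not final yet.
    while step < len(steps) and not all(is_final(s) for s in states):
        actions = steps[step]
        # Every step must supply exactly one action per state.
        if len(actions) != len(states):
            raise ValueError(
                f"Expected {len(states)} actions, got {len(actions)}"
            )
        for state, action in zip(states, actions):
            state.append(action)
        step += 1
    return states
```

  With two states, two steps of actions, and a finality check that is never
  satisfied, the loop ends after the second step instead of attempting
  further transitions.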

* Reimplement distillation with oracle cut size

The code for distillation with an oracle cut size was not reimplemented
after the parser refactor. We did not notice, because we did not have
tests for this functionality. This change restores the functionality
and covers it in the parser tests.

* Rename states2actions to _states_to_actions for consistency

* Test distillation max cuts in NER

* Mark parser/NER tests as slow

* Typo

* Fix invariant in _states_diff_to_actions

* Rename _init_batch -> _init_batch_from_teacher

* Ninja edit the ninja edit

* Check that we raise an exception when we pass the incorrect number of actions

* Remove unnecessary get

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

* Write out condition more explicitly

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2023-02-21 15:47:18 +01:00
models Drop python 3.6/3.7, remove unneeded compat (#12187) 2023-01-27 15:48:20 +01:00
__init__.py Auto-format code with black (#9530) 2021-10-22 13:03:10 +02:00
callbacks.py Add TrainablePipe.{distill,get_teacher_student_loss} (#12016) 2023-01-16 10:25:53 +01:00
character_embed.py Make stable private modules public and adjust names (#11353) 2022-08-30 13:56:35 +02:00
extract_ngrams.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
extract_spans.py Fix entity linker batching (#9669) 2022-03-04 09:17:36 +01:00
featureextractor.py Fix import 2020-10-02 01:12:34 +02:00
staticvectors.py Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
tb_framework.pxd Merge the parser refactor into v4 (#10940) 2023-01-18 11:27:45 +01:00
tb_framework.pyx Reimplement distillation with oracle cut size (#12214) 2023-02-21 15:47:18 +01:00