spaCy/spacy/ml
Daniël de Kok 5e297aa20e
Add TrainablePipe.{distill,get_teacher_student_loss} (#12016)
* Add `TrainablePipe.{distill,get_teacher_student_loss}`

This change adds two methods:

- `TrainablePipe::distill`, which performs a training step of a
  student pipe on a teacher pipe, given a batch of `Doc`s.
- `TrainablePipe::get_teacher_student_loss`, which computes the loss
  of a student relative to the teacher.

The `distill` or `get_teacher_student_loss` methods are also implemented
in the tagger, edit tree lemmatizer, and parser pipes, to enable
distillation in those pipes and as an example for other pipes.
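
As a rough illustration of what a teacher/student loss can look like, the
sketch below computes the cross-entropy of the student's softmax output
against the teacher's, together with the gradient with respect to the
student scores. This is an assumption-laden, dependency-free sketch and not
the code from this change: the function names `softmax` and
`teacher_student_loss`, the plain-list inputs, and the choice of
cross-entropy are all illustrative.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_student_loss(
    teacher_scores: List[List[float]],
    student_scores: List[List[float]],
) -> Tuple[float, List[List[float]]]:
    """Hypothetical helper: cross-entropy of student vs. teacher.

    Returns the summed loss and the gradient with respect to the
    student scores (student probabilities minus teacher probabilities).
    """
    loss = 0.0
    grads: List[List[float]] = []
    for t_row, s_row in zip(teacher_scores, student_scores):
        t = softmax(t_row)
        s = softmax(s_row)
        loss -= sum(tp * math.log(sp) for tp, sp in zip(t, s))
        grads.append([sp - tp for tp, sp in zip(t, s)])
    return loss, grads
```

When the student already matches the teacher, the gradient is zero and the
loss reduces to the entropy of the teacher distribution.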

* Fix stray `Beam` import

* Fix incorrect import

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* TrainablePipe.distill: use `Iterable[Example]`

* Add Pipe.is_distillable method

* Add `validate_distillation_examples`

This first calls `validate_examples` and then checks that the
student/teacher tokens are the same.
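
The two-step check described above might be sketched as follows. This is
a standalone illustration under stated assumptions, not the function added
in this change: `ExamplePair` is a hypothetical stand-in for an `Example`,
`validate_distillation_pairs` is a made-up name, and the type check merely
stands in for the generic `validate_examples` step.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ExamplePair:
    """Hypothetical stand-in for an example holding two tokenizations."""

    teacher_tokens: List[str]
    student_tokens: List[str]


def validate_distillation_pairs(examples: Iterable[ExamplePair]) -> None:
    """Raise if any example is malformed or has mismatched tokens."""
    examples = list(examples)
    # Step 1 (stand-in for the generic validation): every item must be
    # an ExamplePair.
    for eg in examples:
        if not isinstance(eg, ExamplePair):
            raise TypeError(f"expected ExamplePair, got {type(eg).__name__}")
    # Step 2: teacher and student must agree on the tokens.
    for eg in examples:
        if eg.teacher_tokens != eg.student_tokens:
            raise ValueError("teacher and student tokens differ")
```

Running the type check before the token comparison means a malformed
batch fails early with a clearer error than an attribute lookup would give.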

* Update distill documentation

* Add distill documentation for all pipes that support distillation

* Fix incorrect identifier

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Add comment to explain `is_distillable`

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-01-16 10:25:53 +01:00
models Merge remote-tracking branch 'upstream/master' into chore/update-v4-from-master-4 2022-11-03 09:42:36 +01:00
__init__.py Auto-format code with black (#9530) 2021-10-22 13:03:10 +02:00
_precomputable_affine.py Fix compatibility with CuPy 9.x (#11194) 2022-07-26 10:52:01 +02:00
callbacks.py Add TrainablePipe.{distill,get_teacher_student_loss} (#12016) 2023-01-16 10:25:53 +01:00
character_embed.py Make stable private modules public and adjust names (#11353) 2022-08-30 13:56:35 +02:00
extract_ngrams.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
extract_spans.py Fix entity linker batching (#9669) 2022-03-04 09:17:36 +01:00
featureextractor.py Fix import 2020-10-02 01:12:34 +02:00
parser_model.pxd Parser: use C saxpy/sgemm provided by the Ops implementation (#10773) 2022-05-27 11:20:52 +02:00
parser_model.pyx Fix v4 branch to build against Thinc v9 (#11921) 2022-12-17 14:32:19 +01:00
staticvectors.py Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
tb_framework.py TransitionBasedParser.v1 to legacy (#8586) 2021-07-06 15:26:45 +02:00