spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-02-02 21:46:24 +03:00

Daniël de Kok 5e297aa20e Add `TrainablePipe.{distill,get_teacher_student_loss}` (#12016)	2023-01-16 10:25:53 +01:00

* Add `TrainablePipe.{distill,get_teacher_student_loss}`

  This change adds two methods:

  - `TrainablePipe::distill`, which performs a training step of a student pipe on a teacher pipe, given a batch of `Doc`s.
  - `TrainablePipe::get_teacher_student_loss`, which computes the loss of a student relative to the teacher.

  The `distill` or `get_teacher_student_loss` methods are also implemented in the tagger, edit tree lemmatizer, and parser pipes, to enable distillation in those pipes and as an example for other pipes.

* Fix stray `Beam` import
* Fix incorrect import
* Apply suggestions from code review

  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Apply suggestions from code review

  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* TrainablePipe.distill: use `Iterable[Example]`
* Add Pipe.is_distillable method
* Add `validate_distillation_examples`

  This first calls `validate_examples` and then checks that the student/teacher tokens are the same.

* Update distill documentation
* Add distill documentation for all pipes that support distillation
* Fix incorrect identifier
* Apply suggestions from code review

  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Add comment to explain `is_distillable`

  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

converters	Auto-format code with black (#10377)	2022-02-25 10:00:21 +01:00
__init__.pxd	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
__init__.py	Add `TrainablePipe.{distill,get_teacher_student_loss}` (#12016)	2023-01-16 10:25:53 +01:00
align.pyx	Fix alignment for 1-to-1 tokens and lowercasing (#6476)	2020-12-08 14:25:16 +08:00
alignment_array.pxd	Alignment: use a simplified ragged type for performance (#10319)	2022-04-01 09:02:06 +02:00
alignment_array.pyx	Backport parser/alignment optimizations from `feature/refactor-parser` (#10952)	2022-06-24 13:39:52 +02:00
alignment.py	Alignment: use a simplified ragged type for performance (#10319)	2022-04-01 09:02:06 +02:00
augment.py	Preserve missing entity annotation in augmenters (#11540)	2022-09-27 10:16:51 +02:00
batchers.py	Adjust to new `Schedule` class and pass scores to `Optimizer` (#12008)	2022-12-29 08:03:24 +01:00
callbacks.py	🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)	2021-10-14 15:21:40 +02:00
corpus.py	Auto-format code with black (#9664)	2021-11-12 10:00:03 +01:00
example.pxd	Make a pre-check to speed up alignment cache (#6139)	2020-09-24 18:13:39 +02:00
example.pyx	Add `TrainablePipe.{distill,get_teacher_student_loss}` (#12016)	2023-01-16 10:25:53 +01:00
gold_io.pyx	Fix is_sent_start when converting from JSON (fix #7635) (#7655)	2021-04-08 18:24:52 +10:00
initialize.py	Clean up warnings in the test suite (#11331)	2022-08-22 12:04:30 +02:00
iob_utils.py	Preserve missing entity annotation in augmenters (#11540)	2022-09-27 10:16:51 +02:00
loggers.py	New console logger with expanded progress tracking (#11972)	2022-12-23 15:21:44 +01:00
loop.py	Pass `step=0` to `Schedule` class to yield initial learning rate (#12078)	2023-01-09 20:15:02 +01:00
pretrain.py	Clarify how to fill in init_tok2vec after pretraining (#9639)	2021-11-18 15:38:30 +01:00