Update distill documentation

Daniël de Kok 2023-01-13 10:47:34 +01:00
parent d54cc5245a
commit 44498e651a
3 changed files with 25 additions and 16 deletions

@@ -70,8 +70,9 @@ cdef class TrainablePipe(Pipe):
 teacher_pipe (Optional[TrainablePipe]): The teacher pipe to learn
 from.
-examples (Iterable[Example]): Distillation examples. The reference
-must contain teacher annotations (if any).
+examples (Iterable[Example]): Distillation examples. The reference
+and predicted docs must have the same number of tokens and the
+same orthography.
 drop (float): dropout rate.
 sgd (Optional[Optimizer]): An optimizer. Will be created via
 create_optimizer if not set.

@@ -221,8 +221,10 @@ cdef class Parser(TrainablePipe):
 teacher_pipe (Optional[TrainablePipe]): The teacher pipe to learn
 from.
-examples (Iterable[Example]): Distillation examples. The reference
-must contain teacher annotations (if any).
+examples (Iterable[Example]): Distillation examples. The reference
+and predicted docs must have the same number of tokens and the
+same orthography.
+drop (float): dropout rate.
 sgd (Optional[Optimizer]): An optimizer. Will be created via
 create_optimizer if not set.
 losses (Optional[Dict[str, float]]): Optional record of loss during

@@ -239,7 +239,14 @@ predictions and gold-standard annotations, and update the component's model.
 Train a pipe (the student) on the predictions of another pipe (the teacher). The
 student is typically trained on the probability distribution of the teacher, but
 details may differ per pipe. The goal of distillation is to transfer knowledge
-from the teacher to the student. This feature is experimental.
+from the teacher to the student.
+
+The distillation is performed on ~~Example~~ objects. The `Example.reference`
+and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
+same orthography. Even though the reference does not need to have gold
+annotations, the teacher could add its own annotations when necessary.
+
+This feature is experimental.

 > #### Example
 >
@@ -247,19 +254,18 @@ from the teacher to the student. This feature is experimental.
 > teacher_pipe = teacher.add_pipe("your_custom_pipe")
 > student_pipe = student.add_pipe("your_custom_pipe")
 > optimizer = nlp.resume_training()
-> losses = student.distill(teacher_pipe, teacher_docs, student_docs, sgd=optimizer)
+> losses = student.distill(teacher_pipe, examples, sgd=optimizer)
 > ```

-| Name | Description |
-| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
-| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~ |
-| `teacher_docs` | Documents passed through teacher pipes. ~~Iterable[Doc]~~ |
-| `student_docs` | Documents passed through student pipes. Must contain the same tokens as `teacher_docs` but may have different annotations. ~~Iterable[Doc]~~ |
-| _keyword-only_ | |
-| `drop` | Dropout rate. ~~float~~ |
-| `sgd` | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~ |
-| `losses` | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~ |
-| **RETURNS** | The updated `losses` dictionary. ~~Dict[str, float]~~ |
+| Name | Description |
+| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
+| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~ |
+| `examples` | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
+| _keyword-only_ | |
+| `drop` | Dropout rate. ~~float~~ |
+| `sgd` | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~ |
+| `losses` | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~ |
+| **RETURNS** | The updated `losses` dictionary. ~~Dict[str, float]~~ |

 ## TrainablePipe.rehearse {id="rehearse",tag="method,experimental",version="3"}
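
The precondition stated in the updated docs (the reference and predicted docs must have the same number of tokens and the same orthography) can be sketched as a check on plain token texts. This is an illustration only: `tokens_align` is a hypothetical helper, the lists of strings stand in for the token texts of the two docs, and nothing here is spaCy's own validation code.

```python
from typing import Sequence


def tokens_align(reference_words: Sequence[str], predicted_words: Sequence[str]) -> bool:
    """Hypothetical check of the documented precondition on plain token texts."""
    # Both sides must have the same number of tokens ...
    if len(reference_words) != len(predicted_words):
        return False
    # ... and the same orthography (token text) at every position.
    return all(ref == pred for ref, pred in zip(reference_words, predicted_words))


# Identical tokenization: would satisfy the precondition.
print(tokens_align(["I", "like", "cats"], ["I", "like", "cats"]))
# Different token count: would not.
print(tokens_align(["I", "like", "cats"], ["I", "like", "cats", "."]))
# Same count but different orthography: would not.
print(tokens_align(["I", "like", "cats"], ["I", "like", "Cats"]))
```

A check along these lines could be run over the token texts of each example before calling `distill`, so that mismatched pairs are caught early rather than inside the pipe.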