Add distill documentation for all pipes that support distillation
This commit is contained in: parent 44498e651a · commit 2774667f77
@@ -131,6 +131,39 @@ and all pipeline components are applied to the `Doc` in order. Both

| `doc`       | The document to process. ~~Doc~~ |
| **RETURNS** | The processed document. ~~Doc~~  |

## DependencyParser.distill {id="distill", tag="method,experimental", version="4"}

Train a pipe (the student) on the predictions of another pipe (the teacher). The
student is typically trained on the probability distribution of the teacher, but
details may differ per pipe. The goal of distillation is to transfer knowledge
from the teacher to the student.

The distillation is performed on ~~Example~~ objects. The `Example.reference`
and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
same orthography. Even though the reference does not need to have gold
annotations, the teacher can add its own annotations when necessary.

This feature is experimental.

> #### Example
>
> ```python
> teacher_pipe = teacher.add_pipe("parser")
> student_pipe = student.add_pipe("parser")
> optimizer = student.resume_training()
> losses = student_pipe.distill(teacher_pipe, examples, sgd=optimizer)
> ```

| Name           | Description                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~                                                                                  |
| `examples`     | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
| _keyword-only_ |                                                                                                                                              |
| `drop`         | Dropout rate. ~~float~~                                                                                                                      |
| `sgd`          | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~                                |
| `losses`       | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~                |
| **RETURNS**    | The updated `losses` dictionary. ~~Dict[str, float]~~                                                                                        |
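Since the reference docs need no gold annotations, distillation examples can be built from raw text alone, letting the teacher provide the references. A minimal sketch of this, assuming a pretrained `teacher` pipeline and a blank `student`; the model name and texts are illustrative:

```python
import spacy
from spacy.tokens import Doc
from spacy.training import Example

teacher = spacy.load("en_core_web_sm")  # assumed teacher pipeline
student = spacy.blank("en")
student_pipe = student.add_pipe("parser")

texts = ["Distillation transfers knowledge from teacher to student."]

# Reference and predicted docs must have the same tokens and orthography,
# so the predicted side reuses the teacher's tokenization.
examples = []
for doc in teacher.pipe(texts):
    predicted = Doc(student.vocab, words=[t.text for t in doc])
    examples.append(Example(predicted, doc))
```

The resulting `examples` can then be passed to `distill` as in the example above.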
## DependencyParser.pipe {id="pipe",tag="method"}

Apply the pipe to a stream of documents. This usually happens under the hood
@@ -115,6 +115,39 @@ and all pipeline components are applied to the `Doc` in order. Both

| `doc`       | The document to process. ~~Doc~~ |
| **RETURNS** | The processed document. ~~Doc~~  |

## EditTreeLemmatizer.distill {id="distill", tag="method,experimental", version="4"}

Train a pipe (the student) on the predictions of another pipe (the teacher). The
student is typically trained on the probability distribution of the teacher, but
details may differ per pipe. The goal of distillation is to transfer knowledge
from the teacher to the student.

The distillation is performed on ~~Example~~ objects. The `Example.reference`
and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
same orthography. Even though the reference does not need to have gold
annotations, the teacher can add its own annotations when necessary.

This feature is experimental.

> #### Example
>
> ```python
> teacher_pipe = teacher.add_pipe("trainable_lemmatizer")
> student_pipe = student.add_pipe("trainable_lemmatizer")
> optimizer = student.resume_training()
> losses = student_pipe.distill(teacher_pipe, examples, sgd=optimizer)
> ```

| Name           | Description                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~                                                                                  |
| `examples`     | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
| _keyword-only_ |                                                                                                                                              |
| `drop`         | Dropout rate. ~~float~~                                                                                                                      |
| `sgd`          | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~                                |
| `losses`       | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~                |
| **RETURNS**    | The updated `losses` dictionary. ~~Dict[str, float]~~                                                                                        |
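In a full run, `distill` is called once per batch, and the `losses` dict accumulates the loss under the component name. A sketch of such a loop, reusing the `student`, `student_pipe`, `teacher_pipe` and `examples` names from the earlier sketch; all names are illustrative:

```python
import random
from spacy.util import minibatch

optimizer = student.resume_training()
for epoch in range(10):
    losses = {}
    random.shuffle(examples)
    for batch in minibatch(examples, size=32):
        # `losses` is updated in place; the key is the component name.
        student_pipe.distill(teacher_pipe, batch, drop=0.1, sgd=optimizer, losses=losses)
    print(epoch, losses.get("trainable_lemmatizer"))
```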
## EditTreeLemmatizer.pipe {id="pipe",tag="method"}

Apply the pipe to a stream of documents. This usually happens under the hood
@@ -127,6 +127,39 @@ and all pipeline components are applied to the `Doc` in order. Both

| `doc`       | The document to process. ~~Doc~~ |
| **RETURNS** | The processed document. ~~Doc~~  |

## EntityRecognizer.distill {id="distill", tag="method,experimental", version="4"}

Train a pipe (the student) on the predictions of another pipe (the teacher). The
student is typically trained on the probability distribution of the teacher, but
details may differ per pipe. The goal of distillation is to transfer knowledge
from the teacher to the student.

The distillation is performed on ~~Example~~ objects. The `Example.reference`
and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
same orthography. Even though the reference does not need to have gold
annotations, the teacher can add its own annotations when necessary.

This feature is experimental.

> #### Example
>
> ```python
> teacher_pipe = teacher.add_pipe("ner")
> student_pipe = student.add_pipe("ner")
> optimizer = student.resume_training()
> losses = student_pipe.distill(teacher_pipe, examples, sgd=optimizer)
> ```

| Name           | Description                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~                                                                                  |
| `examples`     | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
| _keyword-only_ |                                                                                                                                              |
| `drop`         | Dropout rate. ~~float~~                                                                                                                      |
| `sgd`          | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~                                |
| `losses`       | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~                |
| **RETURNS**    | The updated `losses` dictionary. ~~Dict[str, float]~~                                                                                        |
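Because `distill` requires token-for-token alignment between each example's reference and predicted docs, a quick sanity check before distilling can save confusing errors. This helper is purely illustrative, not part of the spaCy API:

```python
def check_alignment(examples):
    """Verify the documented same-tokens, same-orthography constraint."""
    for eg in examples:
        ref_words = [t.text for t in eg.reference]
        pred_words = [t.text for t in eg.predicted]
        assert len(ref_words) == len(pred_words), "token counts differ"
        assert ref_words == pred_words, "orthography differs"
```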
## EntityRecognizer.pipe {id="pipe",tag="method"}

Apply the pipe to a stream of documents. This usually happens under the hood
@@ -264,6 +297,27 @@ predicted scores.

| `scores`    | Scores representing the model's predictions. ~~StateClass~~                 |
| **RETURNS** | The loss and the gradient, i.e. `(loss, gradient)`. ~~Tuple[float, float]~~ |

## EntityRecognizer.get_teacher_student_loss {id="get_teacher_student_loss", tag="method", version="4"}

Calculate the loss and its gradient for the batch of student scores relative to
the teacher scores.

> #### Example
>
> ```python
> teacher_ner = teacher.get_pipe("ner")
> student_ner = student.add_pipe("ner")
> student_scores = student_ner.predict([eg.predicted for eg in examples])
> teacher_scores = teacher_ner.predict([eg.predicted for eg in examples])
> loss, d_loss = student_ner.get_teacher_student_loss(teacher_scores, student_scores)
> ```

| Name             | Description                                                                 |
| ---------------- | ---------------------------------------------------------------------------- |
| `teacher_scores` | Scores representing the teacher model's predictions.                        |
| `student_scores` | Scores representing the student model's predictions.                        |
| **RETURNS**      | The loss and the gradient, i.e. `(loss, gradient)`. ~~Tuple[float, float]~~ |
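Since the method only consumes score batches, it can also be used read-only, for instance to track how closely the student matches the teacher on held-out data without updating any weights. A sketch, assuming `teacher_ner`, `student_ner` and `dev_examples` set up along the lines of the example above; all names are illustrative:

```python
# Measure teacher-student divergence on held-out examples; no update happens.
dev_docs = [eg.predicted for eg in dev_examples]
teacher_scores = teacher_ner.predict(dev_docs)
student_scores = student_ner.predict(dev_docs)
loss, _ = student_ner.get_teacher_student_loss(teacher_scores, student_scores)
print(f"dev distillation loss: {loss:.4f}")
```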
## EntityRecognizer.create_optimizer {id="create_optimizer",tag="method"}

Create an optimizer for the pipeline component.
@@ -121,6 +121,39 @@ delegate to the [`predict`](/api/morphologizer#predict) and

| `doc`       | The document to process. ~~Doc~~ |
| **RETURNS** | The processed document. ~~Doc~~  |

## Morphologizer.distill {id="distill", tag="method,experimental", version="4"}

Train a pipe (the student) on the predictions of another pipe (the teacher). The
student is typically trained on the probability distribution of the teacher, but
details may differ per pipe. The goal of distillation is to transfer knowledge
from the teacher to the student.

The distillation is performed on ~~Example~~ objects. The `Example.reference`
and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
same orthography. Even though the reference does not need to have gold
annotations, the teacher can add its own annotations when necessary.

This feature is experimental.

> #### Example
>
> ```python
> teacher_pipe = teacher.add_pipe("morphologizer")
> student_pipe = student.add_pipe("morphologizer")
> optimizer = student.resume_training()
> losses = student_pipe.distill(teacher_pipe, examples, sgd=optimizer)
> ```

| Name           | Description                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~                                                                                  |
| `examples`     | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
| _keyword-only_ |                                                                                                                                              |
| `drop`         | Dropout rate. ~~float~~                                                                                                                      |
| `sgd`          | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~                                |
| `losses`       | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~                |
| **RETURNS**    | The updated `losses` dictionary. ~~Dict[str, float]~~                                                                                        |
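The `losses` dict is keyed by component name, so a single dict can track several student pipes distilled in the same loop. A sketch with hypothetical morphologizer and tagger student/teacher pipes; the setup mirrors the earlier sketches:

```python
from spacy.util import minibatch

# Assumes student/teacher pipes, an optimizer and distillation examples set
# up as in the earlier sketches; all names are illustrative.
losses = {}
for batch in minibatch(examples, size=32):
    student_morph.distill(teacher_morph, batch, sgd=optimizer, losses=losses)
    student_tagger.distill(teacher_tagger, batch, sgd=optimizer, losses=losses)
print(losses)  # e.g. {"morphologizer": 1.23, "tagger": 0.87}
```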
## Morphologizer.pipe {id="pipe",tag="method"}

Apply the pipe to a stream of documents. This usually happens under the hood
@@ -106,6 +106,39 @@ and all pipeline components are applied to the `Doc` in order. Both

| `doc`       | The document to process. ~~Doc~~ |
| **RETURNS** | The processed document. ~~Doc~~  |

## SentenceRecognizer.distill {id="distill", tag="method,experimental", version="4"}

Train a pipe (the student) on the predictions of another pipe (the teacher). The
student is typically trained on the probability distribution of the teacher, but
details may differ per pipe. The goal of distillation is to transfer knowledge
from the teacher to the student.

The distillation is performed on ~~Example~~ objects. The `Example.reference`
and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
same orthography. Even though the reference does not need to have gold
annotations, the teacher can add its own annotations when necessary.

This feature is experimental.

> #### Example
>
> ```python
> teacher_pipe = teacher.add_pipe("senter")
> student_pipe = student.add_pipe("senter")
> optimizer = student.resume_training()
> losses = student_pipe.distill(teacher_pipe, examples, sgd=optimizer)
> ```

| Name           | Description                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~                                                                                  |
| `examples`     | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
| _keyword-only_ |                                                                                                                                              |
| `drop`         | Dropout rate. ~~float~~                                                                                                                      |
| `sgd`          | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~                                |
| `losses`       | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~                |
| **RETURNS**    | The updated `losses` dictionary. ~~Dict[str, float]~~                                                                                        |
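After distillation, the student can be scored against the teacher's annotations with `Language.evaluate`, since it accepts the same `Example` objects. A sketch, with `dev_examples` built the same way as the training examples; the names are illustrative:

```python
# The references come from the teacher, so the scores measure how closely
# the student agrees with it, not accuracy against human gold labels.
scores = student.evaluate(dev_examples)
print(scores.get("sents_f"))  # sentence segmentation F-score from the senter
```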
## SentenceRecognizer.pipe {id="pipe",tag="method"}

Apply the pipe to a stream of documents. This usually happens under the hood
@@ -105,6 +105,39 @@ and all pipeline components are applied to the `Doc` in order. Both

| `doc`       | The document to process. ~~Doc~~ |
| **RETURNS** | The processed document. ~~Doc~~  |

## Tagger.distill {id="distill", tag="method,experimental", version="4"}

Train a pipe (the student) on the predictions of another pipe (the teacher). The
student is typically trained on the probability distribution of the teacher, but
details may differ per pipe. The goal of distillation is to transfer knowledge
from the teacher to the student.

The distillation is performed on ~~Example~~ objects. The `Example.reference`
and `Example.predicted` ~~Doc~~s must have the same number of tokens and the
same orthography. Even though the reference does not need to have gold
annotations, the teacher can add its own annotations when necessary.

This feature is experimental.

> #### Example
>
> ```python
> teacher_pipe = teacher.add_pipe("tagger")
> student_pipe = student.add_pipe("tagger")
> optimizer = student.resume_training()
> losses = student_pipe.distill(teacher_pipe, examples, sgd=optimizer)
> ```

| Name           | Description                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `teacher_pipe` | The teacher pipe to learn from. ~~Optional[TrainablePipe]~~                                                                                  |
| `examples`     | Distillation examples. The reference and predicted docs must have the same number of tokens and the same orthography. ~~Iterable[Example]~~ |
| _keyword-only_ |                                                                                                                                              |
| `drop`         | Dropout rate. ~~float~~                                                                                                                      |
| `sgd`          | An optimizer. Will be created via [`create_optimizer`](#create_optimizer) if not set. ~~Optional[Optimizer]~~                                |
| `losses`       | Optional record of the loss during distillation. Updated using the component name as the key. ~~Optional[Dict[str, float]]~~                |
| **RETURNS**    | The updated `losses` dictionary. ~~Dict[str, float]~~                                                                                        |
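Before the first `distill` call the student pipe has to be initialized, and its label set can be copied from the teacher so that both models have matching output dimensions. A sketch, assuming teacher/student pipelines and `examples` as in the earlier sketches:

```python
# Copy the teacher's label set into the freshly added student tagger.
student_tagger = student.add_pipe("tagger")
student_tagger.initialize(
    lambda: examples,
    nlp=student,
    labels=teacher_tagger.label_data,
)
```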
## Tagger.pipe {id="pipe",tag="method"}

Apply the pipe to a stream of documents. This usually happens under the hood