Mirror of https://github.com/explosion/spaCy.git, synced 2025-08-03 20:00:21 +03:00

Type and doc fixes

This commit is contained in:
parent 72c0f7c798
commit 842bbeae29
@@ -55,7 +55,7 @@ def distill_cli(
 
 
 def distill(
-    teacher_model: str,
+    teacher_model: Union[str, Path],
     student_config_path: Union[str, Path],
     output_path: Optional[Union[str, Path]] = None,
     *,
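The hunk above widens `teacher_model` from `str` to `Union[str, Path]`, so callers can pass either a pipeline name or a filesystem path. A minimal, standalone sketch of the normalization such a signature implies (a generic illustration, not spaCy's actual implementation; the helper name here is hypothetical):

```python
from pathlib import Path
from typing import Union


def ensure_path(p: Union[str, Path]) -> Path:
    # Accept either a str or a Path and always hand back a Path,
    # so downstream code can rely on Path methods uniformly.
    return p if isinstance(p, Path) else Path(p)


print(ensure_path("models/teacher"))        # models/teacher
print(ensure_path(Path("models/teacher")))  # models/teacher
```

Accepting both forms and converting once at the boundary keeps the rest of the function free of `isinstance` checks.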
@@ -1707,17 +1707,18 @@ $ python -m spacy project dvc [project_dir] [workflow] [--force] [--verbose] [--
 Distill a _student_ pipeline from a _teacher_ pipeline. Distillation trains the
 models in the student pipeline on the activations of the teacher's models. A
 typical use case for distillation is to extract a smaller, more performant model
-from a larger high-accuracy model. Since distillation uses the activations of the
-teacher, distillation can be performed on a corpus of raw text without (gold standard)
-annotations.
+from a larger high-accuracy model. Since distillation uses the activations of
+the teacher, distillation can be performed on a corpus of raw text without (gold
+standard) annotations. A development set of gold annotations _is_ needed to
+evaluate the distilled model on during distillation.
 
-`distill` will save out the best performing pipeline across all epochs, as well as the final
-pipeline. The `--code` argument can be used to provide a Python file that's
-imported before the training process starts. This lets you register
+`distill` will save out the best performing pipeline across all epochs, as well
+as the final pipeline. The `--code` argument can be used to provide a Python
+file that's imported before the training process starts. This lets you register
 [custom functions](/usage/training#custom-functions) and architectures and refer
 to them in your config, all while still using spaCy's built-in `train` workflow.
-If you need to manage complex multi-step training workflows, check out the new
-[spaCy projects](/usage/projects).
+If you need to manage complex multi-step training workflows, check out
+[Weasel](https://github.com/explosion/weasel).
 
 > #### Example
 >
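The prose above describes training the student on the teacher's activations rather than on gold labels. As a generic, framework-free illustration of that idea (not spaCy's actual objective, whose details this diff does not show), a soft-target distillation loss can be sketched as:

```python
import math


def softmax(logits):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distill_loss(teacher_logits, student_logits):
    # Cross-entropy of the student's distribution against the teacher's
    # soft distribution; it is minimized when the student matches the
    # teacher, which is why no gold labels are needed for the raw corpus.
    teacher = softmax(teacher_logits)
    student = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


matched = distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
mismatched = distill_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
print(matched < mismatched)  # True: matching the teacher lowers the loss
```

The gold-annotated development set mentioned above is still needed, but only to measure how well the distilled student performs, not to drive this loss.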
@@ -1731,14 +1732,14 @@ $ python -m spacy distill [teacher_model] [student_config_path] [--output] [--co
 
 | Name                  | Description                                                                                                                                                                                              |
 | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `teacher_model`       | The teacher pipeline to distill the student from. ~~Path (positional)~~                                                                                                                                  |
+| `teacher_model`       | The teacher pipeline (name or path) to distill the student from. ~~Union[str, Path] (positional)~~                                                                                                       |
 | `student_config_path` | The configuration of the student pipeline. ~~Path (positional)~~                                                                                                                                         |
 | `--output`, `-o`      | Directory to store the distilled pipeline in. Will be created if it doesn't exist. ~~Optional[Path] \(option)~~                                                                                          |
 | `--code`, `-c`        | Comma-separated paths to Python files with additional code to be imported. Allows [registering custom functions](/usage/training#custom-functions) for new architectures. ~~Optional[Path] \(option)~~   |
 | `--verbose`, `-V`     | Show more detailed messages during distillation. ~~bool (flag)~~                                                                                                                                         |
 | `--gpu-id`, `-g`      | GPU ID or `-1` for CPU. Defaults to `-1`. ~~int (option)~~                                                                                                                                               |
 | `--help`, `-h`        | Show help message and available arguments. ~~bool (flag)~~                                                                                                                                               |
-| **CREATES**           | A `dvc.yaml` file in the project directory, based on the steps defined in the given workflow.                                                                                                            |
+| **CREATES**           | The final trained pipeline and the best trained pipeline.                                                                                                                                                |
 
 ## huggingface-hub {id="huggingface-hub",version="3.1"}
 