Fixes from Sofie

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Daniël de Kok 2024-04-11 15:27:32 +02:00 committed by GitHub
parent 10b935347a
commit 72c0f7c798
2 changed files with 7 additions and 7 deletions

@@ -64,7 +64,7 @@ def distill(
 ):
     student_config_path = util.ensure_path(student_config_path)
     output_path = util.ensure_path(output_path)
-    # Make sure all files and paths exists if they are needed
+    # Make sure all files and paths exist if they are needed
     if not student_config_path or (
         str(student_config_path) != "-" and not student_config_path.exists()
     ):
@@ -82,12 +82,12 @@ def distill(
     config = util.load_config(
         student_config_path, overrides=overrides, interpolate=False
     )
-    msg.divider("Initializing pipeline")
+    msg.divider("Initializing student pipeline")
     with show_validation_error(student_config_path, hint_fill=False):
         student = init_nlp_student(config, teacher, use_gpu=use_gpu)
-    msg.good("Initialized pipeline")
-    msg.divider("Distilling pipeline")
+    msg.good("Initialized student pipeline")
+    msg.divider("Distilling student pipeline from teacher")
     distill_nlp(
         teacher,
         student,

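For context, the two hunks above edit spaCy's distillation CLI driver. A condensed sketch of that flow follows, as rough orientation only: the helper names (`init_nlp_student`, `distill_nlp`, `msg`) come from the diff itself, while the import paths and the exact `distill_nlp` call shape are assumptions, not the verbatim spaCy source.

    # Sketch only. Import paths and the distill_nlp call shape are
    # assumptions based on spaCy v4's layout, not verbatim source.
    import spacy
    from spacy import util
    from spacy.training.initialize import init_nlp_student  # assumed location
    from spacy.training.loop import distill_nlp  # assumed location
    from wasabi import msg  # spaCy's CLI logger

    def run_distillation(teacher_model, student_config_path, output_path, use_gpu=-1):
        teacher = spacy.load(teacher_model)  # teacher pipeline to distill from
        config = util.load_config(student_config_path, interpolate=False)
        msg.divider("Initializing student pipeline")
        student = init_nlp_student(config, teacher, use_gpu=use_gpu)
        msg.good("Initialized student pipeline")
        msg.divider("Distilling student pipeline from teacher")
        distill_nlp(teacher, student, output_path, use_gpu=use_gpu)
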
@@ -1707,11 +1707,11 @@ $ python -m spacy project dvc [project_dir] [workflow] [--force] [--verbose] [--
 Distill a _student_ pipeline from a _teacher_ pipeline. Distillation trains the
 models in the student pipeline on the activations of the teacher's models. A
 typical use case for distillation is to extract a smaller, more performant model
-from large high-accuracy model. Since distillation uses the activations of the
-teacher, distillation can be performed on a corpus without (gold standard)
+from a larger high-accuracy model. Since distillation uses the activations of the
+teacher, distillation can be performed on a corpus of raw text without (gold standard)
 annotations.
 
-`distill` will save out the best model from all epochs, as well as the final
+`distill` will save out the best performing pipeline across all epochs, as well as the final
 pipeline. The `--code` argument can be used to provide a Python file that's
 imported before the training process starts. This lets you register
 [custom functions](/usage/training#custom-functions) and architectures and refer
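
Going by the function parameters visible in the first file (`student_config_path`, `output_path`, `use_gpu`) and the docs text above, the invocation this section documents is presumably of this shape, in the docs' own usage-line style; the argument order and any flag other than `--code` are inferred here, not confirmed by the diff:

    $ python -m spacy distill [teacher_model] [student_config_path] [output_path] [--code] [--gpu-id]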