From ae451d1047d2c72044856248d7a36449da9d101b Mon Sep 17 00:00:00 2001
From: kadarakos <kadar.akos@gmail.com>
Date: Wed, 7 Dec 2022 19:43:18 +0000
Subject: [PATCH] documentation

---
 website/docs/api/cli.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
diff --git a/website/docs/api/cli.md b/website/docs/api/cli.md
index 8823a3bd8..9d8b8ae83 100644
--- a/website/docs/api/cli.md
+++ b/website/docs/api/cli.md
@@ -12,6 +12,7 @@ menu:
   - ['train', 'train']
   - ['pretrain', 'pretrain']
   - ['evaluate', 'evaluate']
+  - ['apply', 'apply']
   - ['find-threshold', 'find-threshold']
   - ['assemble', 'assemble']
   - ['package', 'package']
@@ -1162,6 +1163,36 @@ $ python -m spacy evaluate [model] [data_path] [--output] [--code] [--gold-prepr
 | `--help`, `-h`                            | Show help message and available arguments. ~~bool (flag)~~                                                                                                                           |
 | **CREATES**                               | Training results and optional metrics and visualizations.                                                                                                                            |
 
+## apply {#apply new="3.5" tag="command"}
+
+Applies a trained pipeline to data and stores the resulting
+annotated documents in a `DocBin`. The input can be a single file
+or a directory. The recognized input formats are:
+
+1. `.spacy`
+2. `.jsonl` containing a user-specified `text_key`
+3. Files with any other extension are assumed to be plain text files containing a single document.
+
+When a directory is provided, it is traversed recursively to collect all files.
+
+```cli
+$ python -m spacy apply [model] [data-path] [output-file] [--code] [--text-key] [--force-overwrite] [--gpu-id] [--batch-size] [--n-process]
+```
+
+| Name                                      | Description                                                                                                                                                                          |
+| ----------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `model`                                   | Pipeline to apply to the data. Can be a package or a path to a data directory. ~~str (positional)~~                                                                                           |
+| `data-path`                               | File or directory with the input data to annotate: spaCy's [binary format](/api/data-formats#training) (`.spacy`), `.jsonl` or plain text. ~~Path (positional)~~ |
+| `output-file`                             | Output `DocBin` path. ~~str (positional)~~ |
+| `--code`, `-c` <Tag variant="new">3</Tag> | Path to Python file with additional code to be imported. Allows [registering custom functions](/usage/training#custom-functions) for new architectures. ~~Optional[Path] \(option)~~ |
+| `--text-key`, `-tk`                       | Key in each `.jsonl` record that holds the text to process. ~~Optional[str] \(option)~~ |
+| `--force-overwrite`, `-F`                 | Overwrite `output-file` if it already exists. If this flag is not set (default), `apply` quits with a warning instead. ~~bool (flag)~~ |
+| `--gpu-id`, `-g`                          | GPU to use, if any. Defaults to `-1` for CPU. ~~int (option)~~                                                                                                                       |
+| `--batch-size`, `-b`                      | Batch size to use for prediction. Defaults to `1`. ~~int (option)~~ |
+| `--n-process`, `-n`                       | Number of processes to use for prediction. Defaults to `1`. ~~int (option)~~ |
+| `--help`, `-h`                            | Show help message and available arguments. ~~bool (flag)~~                                                                                                                           |
+| **CREATES**                               | A `DocBin` with the annotations from the `model` for all the files found in `data-path`.                                                                                                                            |
+
 ## find-threshold {#find-threshold new="3.5" tag="command"}
 
 Runs prediction trials for a trained model with varying tresholds to maximize