diff --git a/website/docs/api/cli.mdx b/website/docs/api/cli.mdx
index d5b2d54d2..c7beba6db 100644
--- a/website/docs/api/cli.mdx
+++ b/website/docs/api/cli.mdx
@@ -1060,7 +1060,6 @@ $ python -m spacy train [config_path] [--output] [--code] [--verbose] [--gpu-id]
 | `--code`, `-c` | Path to Python file with additional code to be imported. Allows [registering custom functions](/usage/training#custom-functions) for new architectures. ~~Optional[Path] \(option)~~ |
 | `--verbose`, `-V` | Show more detailed messages during training. ~~bool (flag)~~ |
 | `--gpu-id`, `-g` | GPU ID or `-1` for CPU. Defaults to `-1`. ~~int (option)~~ |
-| `--use_rehearse`, `-r` | Use 'rehearsal' updates on a pre-trained model to address the catastrophic forgetting problem. Defaults to `False`. ~~bool (flag)~~ |
 | `--help`, `-h` | Show help message and available arguments. ~~bool (flag)~~ |
 | overrides | Config parameters to override. Should be options starting with `--` that correspond to the config section and value to override, e.g. `--paths.train ./train.spacy`. ~~Any (option/flag)~~ |
 | **CREATES** | The final trained pipeline and the best trained pipeline. |
diff --git a/website/docs/api/data-formats.mdx b/website/docs/api/data-formats.mdx
index c9d88f87c..41ef131d9 100644
--- a/website/docs/api/data-formats.mdx
+++ b/website/docs/api/data-formats.mdx
@@ -192,6 +192,7 @@ process that are used when you run [`spacy train`](/api/cli#train).
 | `eval_frequency` | How often to evaluate during training (steps). Defaults to `200`. ~~int~~ |
 | `frozen_components` | Pipeline component names that are "frozen" and shouldn't be initialized or updated during training. See [here](/usage/training#config-components) for details. Defaults to `[]`. ~~List[str]~~ |
 | `annotating_components` 3.1 | Pipeline component names that should set annotations on the predicted docs during training. See [here](/usage/training#annotating-components) for details. Defaults to `[]`. ~~List[str]~~ |
+| `rehearse_components` 3.5.1 | Pipeline component names that should be rehearsed during training. See [here](/usage/training#rehearse-components) for details. Defaults to `[]`. ~~List[str]~~ |
 | `gpu_allocator` | Library for cupy to route GPU memory allocation to. Can be `"pytorch"` or `"tensorflow"`. Defaults to variable `${system.gpu_allocator}`. ~~str~~ |
 | `logger` | Callable that takes the `nlp` and stdout and stderr `IO` objects, sets up the logger, and returns two new callables to log a training step and to finalize the logger. Defaults to [`ConsoleLogger`](/api/top-level#ConsoleLogger). ~~Callable[[Language, IO, IO], [Tuple[Callable[[Dict[str, Any]], None], Callable[[], None]]]]~~ |
 | `max_epochs` | Maximum number of epochs to train for. `0` means an unlimited number of epochs. `-1` means that the train corpus should be streamed rather than loaded into memory with no shuffling within the training loop. Defaults to `0`. ~~int~~ |
diff --git a/website/docs/usage/training.mdx b/website/docs/usage/training.mdx
index 6cda975cb..0a05037c5 100644
--- a/website/docs/usage/training.mdx
+++ b/website/docs/usage/training.mdx
@@ -575,6 +575,29 @@ now-updated model to the predicted docs.
 
+### Using rehearsal to address catastrophic forgetting {id="rehearse-components", tag="experimental", version="3.5.1"}
+
+Perform “rehearsal” updates on pre-trained components. Rehearsal updates teach the current component to make predictions similar to those of the initial model, in order to address the “catastrophic forgetting” problem. This feature is experimental.
+
+```ini {title="config.cfg (excerpt)"}
+[nlp]
+pipeline = ["sentencizer", "ner", "entity_linker"]
+
+[components.ner]
+source = "en_core_web_sm"
+
+[training]
+rehearse_components = ["ner"]
+```
+
+
+
+Be aware that the loss is calculated as the sum of both the `update` and `rehearse` functions.
+If both the loss and the accuracy of the component increase over time, this can be caused by the trained component making predictions that diverge more and more from the initial model,
+indicating “catastrophic forgetting”.
+
+
+
 ### Using registered functions {id="config-functions"}
 
 The training configuration defined in the config file doesn't have to only