diff --git a/website/docs/api/cli.mdx b/website/docs/api/cli.mdx
index d5b2d54d2..c7beba6db 100644
--- a/website/docs/api/cli.mdx
+++ b/website/docs/api/cli.mdx
@@ -1060,7 +1060,6 @@ $ python -m spacy train [config_path] [--output] [--code] [--verbose] [--gpu-id]
| `--code`, `-c` | Path to Python file with additional code to be imported. Allows [registering custom functions](/usage/training#custom-functions) for new architectures. ~~Optional[Path] \(option)~~ |
| `--verbose`, `-V` | Show more detailed messages during training. ~~bool (flag)~~ |
| `--gpu-id`, `-g` | GPU ID or `-1` for CPU. Defaults to `-1`. ~~int (option)~~ |
-| `--use_rehearse`, `-r` | Use 'rehearsal' updates on a pre-trained model to address the catastrophic forgetting problem. Defaults to `False`. ~~bool (flag)~~ |
| `--help`, `-h` | Show help message and available arguments. ~~bool (flag)~~ |
| overrides | Config parameters to override. Should be options starting with `--` that correspond to the config section and value to override, e.g. `--paths.train ./train.spacy`. ~~Any (option/flag)~~ |
| **CREATES** | The final trained pipeline and the best trained pipeline. |
diff --git a/website/docs/api/data-formats.mdx b/website/docs/api/data-formats.mdx
index c9d88f87c..41ef131d9 100644
--- a/website/docs/api/data-formats.mdx
+++ b/website/docs/api/data-formats.mdx
@@ -192,6 +192,7 @@ process that are used when you run [`spacy train`](/api/cli#train).
| `eval_frequency` | How often to evaluate during training (steps). Defaults to `200`. ~~int~~ |
| `frozen_components` | Pipeline component names that are "frozen" and shouldn't be initialized or updated during training. See [here](/usage/training#config-components) for details. Defaults to `[]`. ~~List[str]~~ |
| `annotating_components` 3.1 | Pipeline component names that should set annotations on the predicted docs during training. See [here](/usage/training#annotating-components) for details. Defaults to `[]`. ~~List[str]~~ |
+| `rehearse_components` 3.5.1 | Pipeline component names that should be updated with rehearsal during training. See [here](/usage/training#rehearse-components) for details. Defaults to `[]`. ~~List[str]~~ |
| `gpu_allocator` | Library for cupy to route GPU memory allocation to. Can be `"pytorch"` or `"tensorflow"`. Defaults to variable `${system.gpu_allocator}`. ~~str~~ |
| `logger` | Callable that takes the `nlp` and stdout and stderr `IO` objects, sets up the logger, and returns two new callables to log a training step and to finalize the logger. Defaults to [`ConsoleLogger`](/api/top-level#ConsoleLogger). ~~Callable[[Language, IO, IO], [Tuple[Callable[[Dict[str, Any]], None], Callable[[], None]]]]~~ |
| `max_epochs` | Maximum number of epochs to train for. `0` means an unlimited number of epochs. `-1` means that the train corpus should be streamed rather than loaded into memory with no shuffling within the training loop. Defaults to `0`. ~~int~~ |
diff --git a/website/docs/usage/training.mdx b/website/docs/usage/training.mdx
index 6cda975cb..0a05037c5 100644
--- a/website/docs/usage/training.mdx
+++ b/website/docs/usage/training.mdx
@@ -575,6 +575,29 @@ now-updated model to the predicted docs.
+### Using rehearsing to address catastrophic forgetting {id="rehearse-components", tag="experimental", version="3.5.1"}
+
+Perform “rehearsal” updates on pre-trained components. Rehearsal updates teach the current component to make predictions similar to those of the initial model, which helps to address the “catastrophic forgetting” problem. This feature is experimental.
+
+```ini {title="config.cfg (excerpt)"}
+[nlp]
+pipeline = ["sentencizer", "ner", "entity_linker"]
+
+[components.ner]
+source = "en_core_web_sm"
+
+[training]
+rehearse_components = ["ner"]
+```
+
+
+
+Be aware that the reported loss is the sum of the losses from both the `update` and the `rehearse` function.
+If both the loss and the accuracy of a component increase over time, this can be caused by the trained component making predictions that diverge further and further from the initial model,
+indicating catastrophic forgetting.
+
+
+
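+The interaction between the two losses can be illustrated with a toy calculation. The sketch below is not spaCy internals; it is a hypothetical example assuming both losses are cross-entropies over label probabilities, showing how the reported total combines a supervised "update" loss on new annotations with a "rehearse" loss that penalizes drift from the initial model's predictions:
+
+```python
+from math import log
+
+def cross_entropy(target, predicted, eps=1e-9):
+    """Cross-entropy between a target distribution and predicted probabilities."""
+    return -sum(t * log(p + eps) for t, p in zip(target, predicted))
+
+# Hypothetical probability distributions over three labels for one token
+gold = [1.0, 0.0, 0.0]           # annotated label from the new training data
+initial_model = [0.7, 0.2, 0.1]  # frozen copy of the pre-trained component
+current_model = [0.6, 0.3, 0.1]  # component being fine-tuned
+
+update_loss = cross_entropy(gold, current_model)           # fit the new data
+rehearse_loss = cross_entropy(initial_model, current_model)  # stay close to the initial model
+total_loss = update_loss + rehearse_loss  # the sum is what gets reported
+print(round(total_loss, 3))
+```
+
+If the current model drifts away from the initial model, the rehearse term grows even while the update term shrinks, which is why the reported loss can rise alongside accuracy on the new data.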
### Using registered functions {id="config-functions"}
The training configuration defined in the config file doesn't have to only