diff --git a/website/docs/api/architectures.md b/website/docs/api/architectures.md
index 73631c64a..cc6f44fcc 100644
--- a/website/docs/api/architectures.md
+++ b/website/docs/api/architectures.md
@@ -274,7 +274,7 @@ architectures into your training config.
| `get_spans` | `Callable` | Function that takes a batch of [`Doc`](/api/doc) objects and returns lists of [`Span`](/api/span) objects for the transformer to process. [See here](/api/transformer#span_getters) for built-in options and examples. |
| `tokenizer_config` | `Dict[str, Any]` | Tokenizer settings passed to [`transformers.AutoTokenizer`](https://huggingface.co/transformers/model_doc/auto.html#transformers.AutoTokenizer). |
-### spacy-transformers.Tok2VecListener.v1 {#Tok2VecListener}
+### spacy-transformers.Tok2VecListener.v1 {#transformers-Tok2VecListener}
> #### Example Config
>
diff --git a/website/docs/api/cli.md b/website/docs/api/cli.md
index 5c971effa..32aaee7b8 100644
--- a/website/docs/api/cli.md
+++ b/website/docs/api/cli.md
@@ -43,7 +43,7 @@ $ python -m spacy download [model] [--direct] [pip args]
| Argument | Type | Description |
| ------------------------------------- | ---------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `model` | positional | Model name, e.g. `en_core_web_sm`.. |
+| `model` | positional | Model name, e.g. [`en_core_web_sm`](/models/en#en_core_web_sm). |
| `--direct`, `-d` | flag | Force direct download of exact model version. |
| pip args <Tag variant="new">2.1</Tag> | - | Additional installation options to be passed to `pip install` when installing the model package. For example, `--user` to install to the user home directory or `--no-deps` to not install model dependencies. |
| `--help`, `-h` | flag | Show help message and available arguments. |
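+
+For example, a typical invocation, optionally passing a pip argument through
+(per the table above):
+
+```bash
+$ python -m spacy download en_core_web_sm
+$ python -m spacy download en_core_web_sm --user
+```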
diff --git a/website/docs/api/data-formats.md b/website/docs/api/data-formats.md
index af7cb26de..32633330e 100644
--- a/website/docs/api/data-formats.md
+++ b/website/docs/api/data-formats.md
@@ -182,10 +182,10 @@ run [`spacy pretrain`](/api/cli#pretrain).
> ```
The main data format used in spaCy v3.0 is a **binary format** created by
-serializing a [`DocBin`](/api/docbin) object, which represents a collection of
-`Doc` objects. This means that you can train spaCy models using the same format
-it outputs: annotated `Doc` objects. The binary format is extremely **efficient
-in storage**, especially when packing multiple documents together.
+serializing a [`DocBin`](/api/docbin), which represents a collection of `Doc`
+objects. This means that you can train spaCy models using the same format it
+outputs: annotated `Doc` objects. The binary format is extremely **efficient in
+storage**, especially when packing multiple documents together.
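+
+As a brief sketch, this is how such a file can be produced with the
+[`DocBin`](/api/docbin) API (texts and output path here are made up for
+illustration):
+
+```python
+import spacy
+from spacy.tokens import DocBin
+
+nlp = spacy.blank("en")
+doc_bin = DocBin()  # collects Doc objects for binary serialization
+for text in ["I like stuff", "You like things"]:
+    doc_bin.add(nlp(text))
+doc_bin.to_disk("./train.spacy")  # conventional .spacy extension
+```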
Typically, the extension for these binary files is `.spacy`, and they are used
as input format for specifying a [training corpus](/api/corpus) and for spaCy's
diff --git a/website/docs/api/dependencyparser.md b/website/docs/api/dependencyparser.md
index 187abfdbb..c7af8ffae 100644
--- a/website/docs/api/dependencyparser.md
+++ b/website/docs/api/dependencyparser.md
@@ -142,14 +142,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/dependencyparser#call) and
## DependencyParser.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> parser = nlp.add_pipe("parser")
-> optimizer = parser.begin_training(pipeline=nlp.pipeline)
+> optimizer = parser.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
diff --git a/website/docs/api/entitylinker.md b/website/docs/api/entitylinker.md
index 930188e26..fa8918dba 100644
--- a/website/docs/api/entitylinker.md
+++ b/website/docs/api/entitylinker.md
@@ -142,14 +142,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/entitylinker#call) and
## EntityLinker.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> entity_linker = nlp.add_pipe("entity_linker", last=True)
-> optimizer = entity_linker.begin_training(pipeline=nlp.pipeline)
+> optimizer = entity_linker.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
diff --git a/website/docs/api/entityrecognizer.md b/website/docs/api/entityrecognizer.md
index 2d66710d7..8d30463ff 100644
--- a/website/docs/api/entityrecognizer.md
+++ b/website/docs/api/entityrecognizer.md
@@ -131,14 +131,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/entityrecognizer#call) and
## EntityRecognizer.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> ner = nlp.add_pipe("ner")
-> optimizer = ner.begin_training(pipeline=nlp.pipeline)
+> optimizer = ner.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
diff --git a/website/docs/api/language.md b/website/docs/api/language.md
index 79782fd72..41d660421 100644
--- a/website/docs/api/language.md
+++ b/website/docs/api/language.md
@@ -200,12 +200,28 @@ more efficient than processing texts one-by-one.
## Language.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the pipeline for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples can either be the full training data or a representative sample. They
+are used to **initialize the models** of trainable pipeline components and are
+passed to each component's [`begin_training`](/api/pipe#begin_training) method, if
+available. Initialization includes validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
+
+<Infobox variant="warning" title="Changed in v3.0">
+
+The `Language.begin_training` method now takes a **function** that is called
+with no arguments and returns a sequence of [`Example`](/api/example) objects
+instead of tuples of `Doc` and `GoldParse` objects.
+
+</Infobox>
> #### Example
>
> ```python
+> get_examples = lambda: examples
> optimizer = nlp.begin_training(get_examples)
> ```
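+
+For illustration, a minimal end-to-end sketch of initializing a blank pipeline
+(the import path and the label used are assumptions for this example):
+
+```python
+import spacy
+from spacy.training import Example
+
+nlp = spacy.blank("en")
+nlp.add_pipe("ner")
+# begin_training expects a zero-argument function returning Example objects
+doc = nlp.make_doc("I like London.")
+examples = [Example.from_dict(doc, {"entities": [(7, 13, "LOC")]})]
+optimizer = nlp.begin_training(lambda: examples)
+```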
@@ -276,7 +292,7 @@ and custom registered functions if needed. See the
| `component_cfg` | `Dict[str, dict]` | Optional dictionary of keyword arguments for components, keyed by component names. Defaults to `None`. |
| **RETURNS** | `Dict[str, float]` | The updated `losses` dictionary. |
-## Language.rehearse {#rehearse tag="method,experimental"}
+## Language.rehearse {#rehearse tag="method,experimental" new="3"}
Perform a "rehearsal" update from a batch of data. Rehearsal updates teach the
current model to make predictions similar to an initial model, to try to address
@@ -302,6 +318,13 @@ the "catastrophic forgetting" problem. This feature is experimental.
Evaluate a model's pipeline components.
+
+<Infobox variant="warning" title="Changed in v3.0">
+The `Language.evaluate` method now takes a batch of [`Example`](/api/example)
+objects instead of tuples of `Doc` and `GoldParse` objects.
+
+</Infobox>
+
> #### Example
>
> ```python
diff --git a/website/docs/api/morphologizer.md b/website/docs/api/morphologizer.md
index 04d189939..12d3050f6 100644
--- a/website/docs/api/morphologizer.md
+++ b/website/docs/api/morphologizer.md
@@ -121,15 +121,21 @@ applied to the `Doc` in order. Both [`__call__`](/api/morphologizer#call) and
## Morphologizer.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> morphologizer = nlp.add_pipe("morphologizer")
-> optimizer = morphologizer.begin_training(pipeline=nlp.pipeline)
+> optimizer = morphologizer.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
diff --git a/website/docs/api/pipe.md b/website/docs/api/pipe.md
index b41ec210e..81ecc5faf 100644
--- a/website/docs/api/pipe.md
+++ b/website/docs/api/pipe.md
@@ -9,8 +9,8 @@ components like the [`EntityRecognizer`](/api/entityrecognizer) or
[`TextCategorizer`](/api/textcategorizer) inherit from it and it defines the
interface that components should follow to function as trainable components in a
spaCy pipeline. See the docs on
-[writing trainable components](/usage/processing-pipelines#trainable) for how to
-use the `Pipe` base class to implement custom components.
+[writing trainable components](/usage/processing-pipelines#trainable-components)
+for how to use the `Pipe` base class to implement custom components.
> #### Why is Pipe implemented in Cython?
>
@@ -106,14 +106,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/pipe#call) and
## Pipe.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> pipe = nlp.add_pipe("your_custom_pipe")
-> optimizer = pipe.begin_training(pipeline=nlp.pipeline)
+> optimizer = pipe.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
@@ -200,7 +206,7 @@ This method needs to be overwritten with your own custom `update` method.
| `losses` | `Dict[str, float]` | Optional record of the loss during training. Updated using the component name as the key. |
| **RETURNS** | `Dict[str, float]` | The updated `losses` dictionary. |
-## Pipe.rehearse {#rehearse tag="method,experimental"}
+## Pipe.rehearse {#rehearse tag="method,experimental" new="3"}
Perform a "rehearsal" update from a batch of data. Rehearsal updates teach the
current model to make predictions similar to an initial model, to try to address
diff --git a/website/docs/api/sentencerecognizer.md b/website/docs/api/sentencerecognizer.md
index 59ada7fcb..cefdbea88 100644
--- a/website/docs/api/sentencerecognizer.md
+++ b/website/docs/api/sentencerecognizer.md
@@ -116,14 +116,20 @@ and [`pipe`](/api/sentencerecognizer#pipe) delegate to the
## SentenceRecognizer.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> senter = nlp.add_pipe("senter")
-> optimizer = senter.begin_training(pipeline=nlp.pipeline)
+> optimizer = senter.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
@@ -193,7 +199,7 @@ Delegates to [`predict`](/api/sentencerecognizer#predict) and
| `losses` | `Dict[str, float]` | Optional record of the loss during training. The value keyed by the model's name is updated. |
| **RETURNS** | `Dict[str, float]` | The updated `losses` dictionary. |
-## SentenceRecognizer.rehearse {#rehearse tag="method,experimental"}
+## SentenceRecognizer.rehearse {#rehearse tag="method,experimental" new="3"}
Perform a "rehearsal" update from a batch of data. Rehearsal updates teach the
current model to make predictions similar to an initial model, to try to address
diff --git a/website/docs/api/tagger.md b/website/docs/api/tagger.md
index 7ea29e53c..9761dea15 100644
--- a/website/docs/api/tagger.md
+++ b/website/docs/api/tagger.md
@@ -114,14 +114,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/tagger#call) and
## Tagger.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> tagger = nlp.add_pipe("tagger")
-> optimizer = tagger.begin_training(pipeline=nlp.pipeline)
+> optimizer = tagger.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
@@ -191,7 +197,7 @@ Delegates to [`predict`](/api/tagger#predict) and
| `losses` | `Dict[str, float]` | Optional record of the loss during training. The value keyed by the model's name is updated. |
| **RETURNS** | `Dict[str, float]` | The updated `losses` dictionary. |
-## Tagger.rehearse {#rehearse tag="method,experimental"}
+## Tagger.rehearse {#rehearse tag="method,experimental" new="3"}
Perform a "rehearsal" update from a batch of data. Rehearsal updates teach the
current model to make predictions similar to an initial model, to try to address
diff --git a/website/docs/api/textcategorizer.md b/website/docs/api/textcategorizer.md
index 494bc569f..73b50b865 100644
--- a/website/docs/api/textcategorizer.md
+++ b/website/docs/api/textcategorizer.md
@@ -122,14 +122,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/textcategorizer#call) and
## TextCategorizer.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> textcat = nlp.add_pipe("textcat")
-> optimizer = textcat.begin_training(pipeline=nlp.pipeline)
+> optimizer = textcat.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
@@ -199,7 +205,7 @@ Delegates to [`predict`](/api/textcategorizer#predict) and
| `losses` | `Dict[str, float]` | Optional record of the loss during training. Updated using the component name as the key. |
| **RETURNS** | `Dict[str, float]` | The updated `losses` dictionary. |
-## TextCategorizer.rehearse {#rehearse tag="method,experimental"}
+## TextCategorizer.rehearse {#rehearse tag="method,experimental" new="3"}
Perform a "rehearsal" update from a batch of data. Rehearsal updates teach the
current model to make predictions similar to an initial model, to try to address
diff --git a/website/docs/api/tok2vec.md b/website/docs/api/tok2vec.md
index 8e5f78bf7..4c820c07c 100644
--- a/website/docs/api/tok2vec.md
+++ b/website/docs/api/tok2vec.md
@@ -125,14 +125,20 @@ and [`set_annotations`](/api/tok2vec#set_annotations) methods.
## Tok2Vec.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> tok2vec = nlp.add_pipe("tok2vec")
-> optimizer = tok2vec.begin_training(pipeline=nlp.pipeline)
+> optimizer = tok2vec.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
diff --git a/website/docs/api/transformer.md b/website/docs/api/transformer.md
index a8b328688..dda212906 100644
--- a/website/docs/api/transformer.md
+++ b/website/docs/api/transformer.md
@@ -159,14 +159,20 @@ applied to the `Doc` in order. Both [`__call__`](/api/transformer#call) and
## Transformer.begin_training {#begin_training tag="method"}
-Initialize the pipe for training, using data examples if available. Returns an
-[`Optimizer`](https://thinc.ai/docs/api-optimizers) object.
+Initialize the component for training and return an
+[`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
+function that returns an iterable of [`Example`](/api/example) objects. The data
+examples are used to **initialize the model** of the component and can either be
+the full training data or a representative sample. Initialization includes
+validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme based on the data.
> #### Example
>
> ```python
> trf = nlp.add_pipe("transformer")
-> optimizer = trf.begin_training(pipeline=nlp.pipeline)
+> optimizer = trf.begin_training(lambda: [], pipeline=nlp.pipeline)
> ```
| Name | Type | Description |
diff --git a/website/docs/models/index.md b/website/docs/models/index.md
index b25e46f1e..d5f87d3b5 100644
--- a/website/docs/models/index.md
+++ b/website/docs/models/index.md
@@ -45,9 +45,9 @@ three components:
2. **Genre:** Type of text the model is trained on, e.g. `web` or `news`.
3. **Size:** Model size indicator, `sm`, `md` or `lg`.
-For example, `en_core_web_sm` is a small English model trained on written web
-text (blogs, news, comments), that includes vocabulary, vectors, syntax and
-entities.
+For example, [`en_core_web_sm`](/models/en#en_core_web_sm) is a small English
+model trained on written web text (blogs, news, comments) that includes
+vocabulary, vectors, syntax and entities.
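+
+In code, that means always loading a model by its full name, e.g.:
+
+```python
+import spacy
+
+nlp = spacy.load("en_core_web_sm")  # full name instead of a shortcut like "en"
+```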
### Model versioning {#model-versioning}
diff --git a/website/docs/usage/training.md b/website/docs/usage/training.md
index e6d328d02..a3a2e7102 100644
--- a/website/docs/usage/training.md
+++ b/website/docs/usage/training.md
@@ -687,13 +687,13 @@ give you everything you need to train fully custom models with
-
-
The [`Example`](/api/example) object contains annotated training data, also
called the **gold standard**. It's initialized with a [`Doc`](/api/doc) object
that will hold the predictions, and another `Doc` object that holds the
-gold-standard annotations. Here's an example of a simple `Example` for
-part-of-speech tags:
+gold-standard annotations. It also includes the **alignment** between those two
+documents if they differ in tokenization. The `Example` class ensures that spaCy
+can rely on one **standardized format** that's passed through the pipeline.
+Here's an example of a simple `Example` for part-of-speech tags:
```python
words = ["I", "like", "stuff"]
@@ -744,7 +744,8 @@ example = Example.from_dict(doc, {"entities": ["U-ORG", "O", "U-TECHNOLOGY", "O"
As of v3.0, the [`Example`](/api/example) object replaces the `GoldParse` class.
It can be constructed in a very similar way, from a `Doc` and a dictionary of
-annotations:
+annotations. For more details, see the
+[migration guide](/usage/v3#migrating-training).
```diff
- gold = GoldParse(doc, entities=entities)
diff --git a/website/docs/usage/v3.md b/website/docs/usage/v3.md
index 02f6882e4..919af3ffb 100644
--- a/website/docs/usage/v3.md
+++ b/website/docs/usage/v3.md
@@ -14,12 +14,49 @@ menu:
### New training workflow and config system {#features-training}
+
+<Infobox title="Details & Documentation" emoji="📖" list>
+- **Usage:** [Training models](/usage/training)
+- **Thinc:** [Thinc's config system](https://thinc.ai/docs/usage-config),
+ [`Config`](https://thinc.ai/docs/api-config#config)
+- **CLI:** [`train`](/api/cli#train), [`pretrain`](/api/cli#pretrain),
+ [`evaluate`](/api/cli#evaluate)
+- **API:** [Config format](/api/data-formats#config),
+ [`registry`](/api/top-level#registry)
+
+</Infobox>
+
### Transformer-based pipelines {#features-transformers}
+
+<Infobox title="Details & Documentation" emoji="📖" list>
+- **Usage:** [Transformers](/usage/transformers),
+ [Training models](/usage/training)
+- **API:** [`Transformer`](/api/transformer),
+ [`TransformerData`](/api/transformer#transformerdata),
+ [`FullTransformerBatch`](/api/transformer#fulltransformerbatch)
+- **Architectures:** [TransformerModel](/api/architectures#TransformerModel),
+ [Tok2VecListener](/api/architectures#transformers-Tok2VecListener),
+ [Tok2VecTransformer](/api/architectures#Tok2VecTransformer)
+- **Models:** [`en_core_bert_sm`](/models/en)
+- **Implementation:**
+ [`spacy-transformers`](https://github.com/explosion/spacy-transformers)
+
+</Infobox>
+
### Custom models using any framework {#features-custom-models}
### Manage end-to-end workflows with projects {#features-projects}
+
+<Infobox title="Details & Documentation" emoji="📖" list>
+- **Usage:** [spaCy projects](/usage/projects),
+ [Training models](/usage/training)
+- **CLI:** [`project`](/api/cli#project), [`train`](/api/cli#train)
+- **Templates:** [`projects`](https://github.com/explosion/projects)
+
+</Infobox>
+
### New built-in pipeline components {#features-pipeline-components}
| Name | Description |
@@ -30,14 +67,48 @@ menu:
| [`AttributeRuler`](/api/attributeruler) | Component for setting token attributes using match patterns. |
| [`Transformer`](/api/transformer) | Component for using [transformer models](/usage/transformers) in your pipeline, accessing outputs and aligning tokens. Provided via [`spacy-transformers`](https://github.com/explosion/spacy-transformers). |
+
+<Infobox title="Details & Documentation" emoji="📖" list>
+- **Usage:** [Processing pipelines](/usage/processing-pipelines)
+- **API:** [Built-in pipeline components](/api#architecture-pipeline)
+- **Implementation:**
+ [`spacy/pipeline`](https://github.com/explosion/spaCy/tree/develop/spacy/pipeline)
+
+</Infobox>
+
### New and improved pipeline component APIs {#features-components}
- `Language.factory`, `Language.component`
- `Language.analyze_pipes`
- Adding components from other models
+
+<Infobox title="Details & Documentation" emoji="📖" list>
+- **Usage:** [Custom components](/usage/processing-pipelines#custom-components),
+ [Defining components during training](/usage/training#config-components)
+- **API:** [`Language`](/api/language)
+- **Implementation:**
+ [`spacy/language.py`](https://github.com/explosion/spaCy/tree/develop/spacy/language.py)
+
+</Infobox>
+
### Type hints and type-based data validation {#features-types}
+> #### Example
+>
+> ```python
+> from spacy.language import Language
+> from pydantic import StrictBool
+>
+> @Language.factory("my_component")
+> def create_my_component(
+> nlp: Language,
+> name: str,
+> custom: StrictBool
+> ):
+> ...
+> ```
+
spaCy v3.0 officially drops support for Python 2 and now requires **Python
3.6+**. This also means that the code base can take full advantage of
[type hints](https://docs.python.org/3/library/typing.html). spaCy's user-facing
@@ -54,13 +125,36 @@ validation of Thinc's [config system](https://thinc.ai/docs/usage-config), which
lets you register **custom functions with typed arguments**, reference them
in your config and see validation errors if the argument values don't match.
-### CLI
+<Infobox title="Details & Documentation" emoji="📖" list>
-| Name | Description |
-| --------------------------------------- | -------------------------------------------------------------------------------------------------------- |
-| [`init config`](/api/cli#init-config) | Initialize a [training config](/usage/training) file for a blank language or auto-fill a partial config. |
-| [`debug config`](/api/cli#debug-config) | Debug a [training config](/usage/training) file and show validation errors. |
-| [`project`](/api/cli#project) | Subcommand for cloning and running [spaCy projects](/usage/projects). |
+- **Usage:**
+ [Component type hints and validation](/usage/processing-pipelines#type-hints),
+ [Training with custom code](/usage/training#custom-code)
+- **Thinc:**
+ [Type checking in Thinc](https://thinc.ai/docs/usage-type-checking),
+ [Thinc's config system](https://thinc.ai/docs/usage-config)
+
+</Infobox>
+
+### New methods, attributes and commands
+
+The following methods, attributes and commands are new in spaCy v3.0.
+
+| Name | Description |
+| ------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| [`Token.lex`](/api/token#attributes) | Access a token's [`Lexeme`](/api/lexeme). |
+| [`Language.select_pipes`](/api/language#select_pipes) | Contextmanager for enabling or disabling specific pipeline components for a block. |
+| [`Language.analyze_pipes`](/api/language#analyze_pipes) | [Analyze](/usage/processing-pipelines#analysis) components and their interdependencies. |
+| [`Language.resume_training`](/api/language#resume_training) | Experimental: continue training a pretrained model and initialize "rehearsal" for components that implement a `rehearse` method to prevent catastrophic forgetting. |
+| [`@Language.factory`](/api/language#factory) [`@Language.component`](/api/language#component) | Decorators for [registering](/usage/processing-pipelines#custom-components) pipeline component factories and simple stateless component functions. |
| [`Language.has_factory`](/api/language#has_factory) | Check whether a component factory is registered on a language class. |
| [`Language.get_factory_meta`](/api/language#get_factory_meta) [`Language.get_pipe_meta`](/api/language#get_pipe_meta) | Get the [`FactoryMeta`](/api/language#factorymeta) with component metadata for a factory or instance name. |
| [`Language.config`](/api/language#config) | The [config](/usage/training#config) used to create the current `nlp` object. An instance of [`Config`](https://thinc.ai/docs/api-config#config) that can be saved to disk and used for training. |
+| [`Pipe.score`](/api/pipe#score) | Method on trainable pipeline components that returns a dictionary of evaluation scores. |
+| [`registry`](/api/top-level#registry) | Function registry to map functions to string names that can be referenced in [configs](/usage/training#config). |
+| [`init config`](/api/cli#init-config) | CLI command for initializing a [training config](/usage/training) file for a blank language or auto-filling a partial config. |
+| [`debug config`](/api/cli#debug-config) | CLI command for debugging a [training config](/usage/training) file and showing validation errors. |
+| [`project`](/api/cli#project) | Suite of CLI commands for cloning, running and managing [spaCy projects](/usage/projects). |
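+
+As a quick sketch of two of these methods in action (details of the printed
+analysis may differ):
+
+```python
+import spacy
+
+nlp = spacy.load("en_core_web_sm")
+# Temporarily run only the entity recognizer for this block
+with nlp.select_pipes(enable="ner"):
+    doc = nlp("Some text about Facebook")
+# Print an overview of components and the annotations they assign
+nlp.analyze_pipes(pretty=True)
+```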
## Backwards Incompatibilities {#incompat}
@@ -70,12 +164,21 @@ usability. The following section lists the relevant changes to the user-facing
API. For specific examples of how to rewrite your code, check out the
[migration guide](#migrating).
-### Compatibility {#incompat-compat}
+<Infobox variant="warning">
-- spaCy now requires **Python 3.6+**.
+Note that spaCy v3.0 now requires **Python 3.6+**.
+
+</Infobox>
### API changes {#incompat-api}
+- Model symlinks, the `link` command and shortcut names are now deprecated.
+ There can be many [different models](/models) and not just one "English
+ model", so you should always use the full model name like
+ [`en_core_web_sm`](/models/en) explicitly.
+- The [`train`](/api/cli#train) and [`pretrain`](/api/cli#pretrain) commands now
+ only take a `config.cfg` file containing the full
+ [training config](/usage/training#config).
- [`Language.add_pipe`](/api/language#add_pipe) now takes the **string name** of
the component factory instead of the component function.
- **Custom pipeline components** now need to be decorated with the
@@ -87,6 +190,20 @@ API. For specific examples of how to rewrite your code, check out the
- The `Language.disable_pipes` contextmanager has been replaced by
[`Language.select_pipes`](/api/language#select_pipes), which can explicitly
disable or enable components.
+- The [`Language.update`](/api/language#update),
+ [`Language.evaluate`](/api/language#evaluate) and
+ [`Pipe.update`](/api/pipe#update) methods now all take batches of
+ [`Example`](/api/example) objects instead of `Doc` and `GoldParse` objects, or
+ raw text and a dictionary of annotations.
+ [`Language.begin_training`](/api/language#begin_training) and
+ [`Pipe.begin_training`](/api/pipe#begin_training) now take a function that
+ returns a sequence of `Example` objects to initialize the model instead of a
+ list of tuples.
+- [`Matcher.add`](/api/matcher#add),
+ [`PhraseMatcher.add`](/api/phrasematcher#add) and
+ [`DependencyMatcher.add`](/api/dependencymatcher#add) now only accept a list
+ of patterns as the second argument (instead of a variable number of
+ arguments). The `on_match` callback becomes an optional keyword argument.
### Removed or renamed API {#incompat-removed}
@@ -96,6 +213,7 @@ API. For specific examples of how to rewrite your code, check out the
| `GoldParse` | [`Example`](/api/example) |
| `GoldCorpus` | [`Corpus`](/api/corpus) |
| `spacy debug-data` | [`spacy debug data`](/api/cli#debug-data) |
+| `spacy profile` | [`spacy debug profile`](/api/cli#debug-profile) |
| `spacy link`, `util.set_data_path`, `util.get_data_path` | not needed, model symlinks are deprecated |
The following deprecated methods, attributes and arguments were removed in v3.0.
@@ -121,7 +239,7 @@ on them.
Model symlinks and shortcuts like `en` are now officially deprecated. There are
[many different models](/models) with different capabilities and not just one
"English model". In order to download and load a model, you should always use
-its full name – for instance, `en_core_web_sm`.
+its full name – for instance, [`en_core_web_sm`](/models/en#en_core_web_sm).
```diff
- python -m spacy download en
@@ -224,6 +342,51 @@ and you typically shouldn't have to use it in your code.
+ parser = nlp.add_pipe("parser")
```
+If you need to add a component from an existing pretrained model, you can now
+use the `source` argument on [`nlp.add_pipe`](/api/language#add_pipe). This will
+check that the component is compatible, and take care of porting over all of
+its config. During training, you can also reference existing pretrained components
+in your [config](/usage/training#config-components) and decide whether or not
+they should be updated with more data.
+
+> #### config.cfg (excerpt)
+>
+> ```ini
+> [components.ner]
+> source = "en_core_web_sm"
+> component = "ner"
+> ```
+
+```diff
+source_nlp = spacy.load("en_core_web_sm")
+nlp = spacy.blank("en")
+- ner = source_nlp.get_pipe("ner")
+- nlp.add_pipe(ner)
++ nlp.add_pipe("ner", source=source_nlp)
+```
+
+### Adding match patterns {#migrating-matcher}
+
+The [`Matcher.add`](/api/matcher#add),
+[`PhraseMatcher.add`](/api/phrasematcher#add) and
+[`DependencyMatcher.add`](/api/dependencymatcher#add) methods now only accept a
+**list of patterns** as the second argument (instead of a variable number of
+arguments). The `on_match` callback becomes an optional keyword argument.
+
+```diff
+matcher = Matcher(nlp.vocab)
+patterns = [[{"TEXT": "Google"}, {"TEXT": "Now"}], [{"TEXT": "GoogleNow"}]]
+- matcher.add("GoogleNow", on_match, *patterns)
++ matcher.add("GoogleNow", patterns, on_match=on_match)
+```
+
+```diff
+matcher = PhraseMatcher(nlp.vocab)
+patterns = [nlp("health care reform"), nlp("healthcare reform")]
+- matcher.add("HEALTH", on_match, *patterns)
++ matcher.add("HEALTH", patterns, on_match=on_match)
+```
+
### Training models {#migrating-training}
To train your models, you should now pretty much always use the
@@ -233,15 +396,20 @@ use a [flexible config file](/usage/training#config) that describes all training
settings and hyperparameters, as well as your pipeline, model components and
architectures to use. The `--code` argument lets you pass in code containing
[custom registered functions](/usage/training#custom-code) that you can
-reference in your config.
+reference in your config. To get started, check out the
+[quickstart widget](/usage/training#quickstart).
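+
+A typical invocation might look like this (flags shown are illustrative; see
+the [`train`](/api/cli#train) docs for the exact options):
+
+```bash
+$ python -m spacy train ./config.cfg --output ./output --code ./functions.py
+```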
#### Binary .spacy training data format {#migrating-training-format}
-spaCy now uses a new
-[binary training data format](/api/data-formats#binary-training), which is much
-smaller and consists of `Doc` objects, serialized via the
-[`DocBin`](/api/docbin). You can convert your existing JSON-formatted data using
-the [`spacy convert`](/api/cli#convert) command, which outputs `.spacy` files:
+spaCy v3.0 uses a new
+[binary training data format](/api/data-formats#binary-training) created by
+serializing a [`DocBin`](/api/docbin), which represents a collection of `Doc`
+objects. This means that you can train spaCy models using the same format it
+outputs: annotated `Doc` objects. The binary format is extremely **efficient in
+storage**, especially when packing multiple documents together.
+
+You can convert your existing JSON-formatted data using the
+[`spacy convert`](/api/cli#convert) command, which outputs `.spacy` files:
```bash
$ python -m spacy convert ./training.json ./output
@@ -273,13 +441,72 @@ workflows, from data preprocessing to training and packaging your model.
-#### Migrating training scripts to CLI command and config {#migrating-training-scripts}
-
-
-
#### Training via the Python API {#migrating-training-python}
-
+For most use cases, you **shouldn't** have to write your own training scripts
+anymore. Instead, you can use [`spacy train`](/api/cli#train) with a
+[config file](/usage/training#config) and custom
+[registered functions](/usage/training#custom-code) if needed. You can even
+register callbacks that can modify the `nlp` object at different stages of its
+lifecycle to fully customize it before training.
+
+If you do decide to use the [internal training API](/usage/training#api) from
+Python, you should only need a few small modifications to convert your scripts
+from spaCy v2.x to v3.x. The [`Example.from_dict`](/api/example#from_dict)
+classmethod takes a reference `Doc` and a
+[dictionary of annotations](/api/data-formats#dict-input), similar to the
+"simple training style" in spaCy v2.x:
+
+```diff
+### Migrating Doc and GoldParse
+doc = nlp.make_doc("Mark Zuckerberg is the CEO of Facebook")
+entities = [(0, 15, "PERSON"), (30, 38, "ORG")]
+- gold = GoldParse(doc, entities=entities)
++ example = Example.from_dict(doc, {"entities": entities})
+```
+
+```diff
+### Migrating simple training style
+text = "Mark Zuckerberg is the CEO of Facebook"
+annotations = {"entities": [(0, 15, "PERSON"), (30, 38, "ORG")]}
++ doc = nlp.make_doc(text)
++ example = Example.from_dict(doc, annotations)
+```
+
+The [`Language.update`](/api/language#update),
+[`Language.evaluate`](/api/language#evaluate) and
+[`Pipe.update`](/api/pipe#update) methods now all take batches of
+[`Example`](/api/example) objects instead of `Doc` and `GoldParse` objects, or
+raw text and a dictionary of annotations.
+
+```python
+### Training loop {highlight="11"}
+TRAIN_DATA = [
+ ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
+ ("I like London.", {"entities": [(7, 13, "LOC")]}),
+]
+nlp.begin_training()
+for i in range(20):
+ random.shuffle(TRAIN_DATA)
+ for batch in minibatch(TRAIN_DATA):
+ examples = []
+ for text, annots in batch:
+ examples.append(Example.from_dict(nlp.make_doc(text), annots))
+ nlp.update(examples)
+```
+
+[`Language.begin_training`](/api/language#begin_training) and
+[`Pipe.begin_training`](/api/pipe#begin_training) now take a function that
+returns a sequence of `Example` objects to initialize the model instead of a
+list of tuples. The data examples are used to **initialize the models** of
+trainable pipeline components, which includes validating the network,
+[inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
+setting up the label scheme.
+
+```diff
+- nlp.begin_training(examples)
++ nlp.begin_training(lambda: examples)
+```
#### Packaging models {#migrating-training-packaging}
diff --git a/website/src/components/icon.js b/website/src/components/icon.js
index a5ccf1bde..322337955 100644
--- a/website/src/components/icon.js
+++ b/website/src/components/icon.js
@@ -23,6 +23,7 @@ import { ReactComponent as MoonIcon } from '../images/icons/moon.svg'
import { ReactComponent as ClipboardIcon } from '../images/icons/clipboard.svg'
import { ReactComponent as NetworkIcon } from '../images/icons/network.svg'
import { ReactComponent as DownloadIcon } from '../images/icons/download.svg'
+import { ReactComponent as PackageIcon } from '../images/icons/package.svg'
import classes from '../styles/icon.module.sass'
@@ -49,6 +50,7 @@ const icons = {
clipboard: ClipboardIcon,
network: NetworkIcon,
download: DownloadIcon,
+ package: PackageIcon,
}
export default function Icon({ name, width = 20, height, inline = false, variant, className }) {
diff --git a/website/src/components/infobox.js b/website/src/components/infobox.js
index 046384986..363638bf2 100644
--- a/website/src/components/infobox.js
+++ b/website/src/components/infobox.js
@@ -5,8 +5,17 @@ import classNames from 'classnames'
import Icon from './icon'
import classes from '../styles/infobox.module.sass'
-export default function Infobox({ title, emoji, id, variant = 'default', className, children }) {
+export default function Infobox({
+ title,
+ emoji,
+ id,
+ variant = 'default',
+ list = false,
+ className,
+ children,
+}) {
const infoboxClassNames = classNames(classes.root, className, {
+ [classes.list]: !!list,
[classes.warning]: variant === 'warning',
[classes.danger]: variant === 'danger',
})
diff --git a/website/src/components/link.js b/website/src/components/link.js
index 34df20554..3644479c5 100644
--- a/website/src/components/link.js
+++ b/website/src/components/link.js
@@ -8,13 +8,21 @@ import Icon from './icon'
import classes from '../styles/link.module.sass'
import { isString } from './util'
-const internalRegex = /(http(s?)):\/\/(prodi.gy|spacy.io|irl.spacy.io)/gi
+const internalRegex = /(http(s?)):\/\/(prodi.gy|spacy.io|irl.spacy.io|explosion.ai|course.spacy.io)/gi
const Whitespace = ({ children }) => (
// Ensure that links are always wrapped in spaces
<> {children} </>
)
+function getIcon(dest) {
+ if (/(github.com)/.test(dest)) return 'code'
+ if (/^\/?api\/architectures#/.test(dest)) return 'network'
+ if (/^\/?api/.test(dest)) return 'docs'
+ if (/^\/?models\/(.+)/.test(dest)) return 'package'
+ return null
+}
+
export default function Link({
children,
to,
@@ -30,22 +38,19 @@ export default function Link({
}) {
const dest = to || href
const external = forceExternal || /(http(s?)):\/\//gi.test(dest)
- const isApi = !external && !hidden && !hideIcon && /^\/?api/.test(dest)
- const isArch = !external && !hidden && !hideIcon && /^\/?api\/architectures#/.test(dest)
- const isSource = external && !hidden && !hideIcon && /(github.com)/.test(dest)
- const withIcon = isApi || isArch || isSource
+ const icon = getIcon(dest)
+ const withIcon = !hidden && !hideIcon && !!icon
const sourceWithText = withIcon && isString(children)
const linkClassNames = classNames(classes.root, className, {
[classes.hidden]: hidden,
- [classes.nowrap]: (withIcon && !sourceWithText) || isArch,
+ [classes.nowrap]: (withIcon && !sourceWithText) || icon === 'network',
[classes.withIcon]: withIcon,
})
const Wrapper = ws ? Whitespace : Fragment
- const icon = isArch ? 'network' : isApi ? 'docs' : isSource ? 'code' : null
const content = (
<>
{sourceWithText ? <span>{children}</span> : children}
- {icon && <Icon name={icon} width={16} inline className={classes.icon} />}
+ {withIcon && <Icon name={icon} width={16} inline className={classes.icon} />}
</>
)
diff --git a/website/src/images/icons/package.svg b/website/src/images/icons/package.svg
new file mode 100644
index 000000000..4edaf4e6f
--- /dev/null
+++ b/website/src/images/icons/package.svg
@@ -0,0 +1,5 @@
+
diff --git a/website/src/styles/infobox.module.sass b/website/src/styles/infobox.module.sass
index baf9919c3..8d6071f18 100644
--- a/website/src/styles/infobox.module.sass
+++ b/website/src/styles/infobox.module.sass
@@ -14,6 +14,21 @@
font-size: inherit
line-height: inherit
+ ul li
+ padding-left: 0.75em
+
+.list ul li
+ font-size: var(--font-size-sm)
+ list-style: none
+ padding: 0
+ margin: 0 0 0.35rem 0
+
+ &:before
+ all: initial
+
+ a, a span
+ border-bottom: 0 !important
+
.title
font-weight: bold
color: var(--color-theme)