From 158d8c1e48961f8c962df01f72e5818f3ec2651d Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Wed, 29 Jul 2020 18:44:10 +0200
Subject: [PATCH] Update docs [ci skip]
---
 website/docs/api/architectures.md            |   2 +
 website/docs/api/top-level.md                |  25 ++
 website/docs/api/transformer.md              |  82 +++++-
 website/docs/images/pipeline_transformer.svg |  37 +++
 website/docs/usage/transformers.md           | 294 +++++++++++++------
 5 files changed, 347 insertions(+), 93 deletions(-)
 create mode 100644 website/docs/images/pipeline_transformer.svg

diff --git a/website/docs/api/architectures.md b/website/docs/api/architectures.md
index a87c2a1e8..43387b8ca 100644
--- a/website/docs/api/architectures.md
+++ b/website/docs/api/architectures.md
@@ -26,6 +26,8 @@ TODO: intro and how architectures work, link to

### spacy-transformers.TransformerModel.v1 {#TransformerModel}

+### spacy-transformers.Tok2VecListener.v1 {#spacy-transformers.Tok2VecListener.v1}
+
## Parser & NER architectures {#parser source="spacy/ml/models/parser.py"}

### spacy.TransitionBasedParser.v1 {#TransitionBasedParser}

diff --git a/website/docs/api/top-level.md b/website/docs/api/top-level.md
index a463441c7..ede7f9e21 100644
--- a/website/docs/api/top-level.md
+++ b/website/docs/api/top-level.md
@@ -304,6 +304,31 @@ factories.
| `losses`       | Registry for functions that create [losses](https://thinc.ai/docs/api-loss). |
| `initializers` | Registry for functions that create [initializers](https://thinc.ai/docs/api-initializers). |

+### spacy-transformers registry {#registry-transformers}
+
+The following registries are added by the
+[`spacy-transformers`](https://github.com/explosion/spacy-transformers) package.
+See the [`Transformer`](/api/transformer) API reference and
+[usage docs](/usage/transformers) for details.
+
+> #### Example
+>
+> ```python
+> import spacy_transformers
+>
+> @spacy_transformers.registry.annotation_setters("my_annotation_setter.v1")
+> def configure_custom_annotation_setter():
+>     def annotation_setter(docs, trf_data) -> None:
+>         ...  # Set annotations on the docs
+>
+>     return annotation_setter
+> ```
+
+| Registry name                                                | Description |
+| ------------------------------------------------------------ | ----------- |
+| [`span_getters`](/api/transformer#span_getters)              | Registry for functions that take a batch of `Doc` objects and return a list of `Span` objects to be processed by the transformer, e.g. sentences. |
+| [`annotation_setters`](/api/transformer#annotation_setters)  | Registry for functions that create annotation setters. Annotation setters are functions that take a batch of `Doc` objects and a [`FullTransformerBatch`](/api/transformer#fulltransformerbatch) and can set additional annotations on the `Doc`. |
+
## Training data and alignment {#gold source="spacy/gold"}

### gold.docs_to_json {#docs_to_json tag="function"}

diff --git a/website/docs/api/transformer.md b/website/docs/api/transformer.md
index e89ecb6b7..386f65a0a 100644
--- a/website/docs/api/transformer.md
+++ b/website/docs/api/transformer.md
@@ -31,8 +31,10 @@ attributes. We also calculate an alignment between the word-piece tokens and
the spaCy tokenization, so that we can use the last hidden states to set the
`Doc.tensor` attribute. When multiple word-piece tokens align to the same spaCy
token, the spaCy token receives the sum of their values. To access the values,
-you can use the custom [`Doc._.trf_data`](#custom-attributes) attribute. For
-more details, see the [usage documentation](/usage/transformers).
+you can use the custom [`Doc._.trf_data`](#custom-attributes) attribute. The
+package also adds the function registries [`@span_getters`](#span_getters) and
+[`@annotation_setters`](#annotation_setters) with several built-in registered
+functions. For more details, see the [usage documentation](/usage/transformers).
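+
+For example, once a pipeline containing a transformer component has processed a
+text, you can inspect the data like this (a minimal sketch – it assumes `nlp`
+is a loaded pipeline whose pipeline includes a `transformer` component):
+
+```python
+doc = nlp("This is a text.")
+# TransformerData for this Doc, set by the transformer component
+trf_data = doc._.trf_data
+tokvecs = trf_data.tensors[-1]  # the last hidden states
+```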

## Config and implementation {#config}

@@ -51,11 +53,11 @@ architectures and their arguments and hyperparameters.
> nlp.add_pipe("transformer", config=DEFAULT_CONFIG)
> ```

-| Setting             | Type                                       | Description | Default |
-| ------------------- | ------------------------------------------ | ----------- | ------- |
-| `max_batch_items`   | int                                        | Maximum size of a padded batch. | `4096` |
-| `annotation_setter` | Callable                                   | Function that takes a batch of `Doc` objects and a [`FullTransformerBatch`](#fulltransformerbatch) and can set additional annotations on the `Doc`. | `null_annotation_setter` |
-| `model`             | [`Model`](https://thinc.ai/docs/api-model) | The model to use. | [TransformerModel](/api/architectures#TransformerModel) |
+| Setting             | Type                                       | Description | Default |
+| ------------------- | ------------------------------------------ | ----------- | ------- |
+| `max_batch_items`   | int                                        | Maximum size of a padded batch. | `4096` |
+| `annotation_setter` | Callable                                   | Function that takes a batch of `Doc` objects and a [`FullTransformerBatch`](/api/transformer#fulltransformerbatch) and can set additional annotations on the `Doc`. | `null_annotation_setter` |
+| `model`             | [`Model`](https://thinc.ai/docs/api-model) | The model to use. | [TransformerModel](/api/architectures#TransformerModel) |

```python
https://github.com/explosion/spacy-transformers/blob/master/spacy_transformers/pipeline_component.py
```

@@ -390,6 +392,72 @@ Split a `TransformerData` object that represents a batch into a list with one

| ----------- | ----------------------- | -------------- |
| **RETURNS** | `List[TransformerData]` |                |

+## Span getters {#span_getters tag="registered functions" source="github.com/explosion/spacy-transformers/blob/master/spacy_transformers/span_getters.py"}
+
+Span getters are functions that take a batch of [`Doc`](/api/doc) objects and
+return a list of [`Span`](/api/span) objects for each doc, to be processed by
+the transformer. The returned spans can overlap. Span getters can be referenced
+in the config's `[components.transformer.model.get_spans]` block to customize
+the sequences processed by the transformer. You can also register custom span
+getters using the `@registry.span_getters` decorator.
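+
+As an illustration, here is a sketch of a custom getter that produces fixed-size,
+potentially overlapping windows. The registered name `my_strided_spans.v1` and
+its `window`/`stride` settings are illustrative, not part of the built-in API:
+
+```python
+import spacy_transformers
+
+@spacy_transformers.registry.span_getters("my_strided_spans.v1")
+def configure_my_strided_spans(window: int = 128, stride: int = 96):
+    def get_strided_spans(docs):
+        # One list of spans per doc; spans overlap when stride < window
+        return [
+            [doc[i : i + window] for i in range(0, len(doc), stride)]
+            for doc in docs
+        ]
+
+    return get_strided_spans
+```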
+
+> #### Example
+>
+> ```python
+> @registry.span_getters("sent_spans.v1")
+> def configure_get_sent_spans() -> Callable:
+>     def get_sent_spans(docs: Iterable[Doc]) -> List[List[Span]]:
+>         return [list(doc.sents) for doc in docs]
+>
+>     return get_sent_spans
+> ```
+
+| Name        | Type               | Description |
+| ----------- | ------------------ | ----------- |
+| `docs`      | `Iterable[Doc]`    | A batch of `Doc` objects. |
+| **RETURNS** | `List[List[Span]]` | The spans to be processed by the transformer, one list per `Doc`. |
+
+The following built-in functions are available:
+
+| Name               | Description |
+| ------------------ | ----------- |
+| `doc_spans.v1`     | Create a span for each doc (no transformation, process each text). |
+| `sent_spans.v1`    | Create a span for each sentence if sentence boundaries are set. |
+| `strided_spans.v1` | Create spans using a sliding window and stride; spans can overlap if the stride is smaller than the window. |
+
+## Annotation setters {#annotation_setters tag="registered functions" source="github.com/explosion/spacy-transformers/blob/master/spacy_transformers/annotation_setters.py"}
+
+Annotation setters are functions that take a batch of `Doc` objects and a
+[`FullTransformerBatch`](/api/transformer#fulltransformerbatch) and can set
+additional annotations on the `Doc`, e.g. to set custom or built-in attributes.
+You can register custom annotation setters using the
+`@registry.annotation_setters` decorator.
+
+> #### Example
+>
+> ```python
+> @registry.annotation_setters("spacy-transformer.null_annotation_setter.v1")
+> def configure_null_annotation_setter() -> Callable:
+>     def setter(docs: List[Doc], trf_data: FullTransformerBatch) -> None:
+>         pass
+>
+>     return setter
+> ```
+
+| Name       | Type                   | Description |
+| ---------- | ---------------------- | ----------- |
+| `docs`     | `List[Doc]`            | A batch of `Doc` objects. |
+| `trf_data` | `FullTransformerBatch` | The transformers data for the batch. |
+
+The following built-in functions are available:
+
+| Name                                          | Description |
+| --------------------------------------------- | ----------- |
+| `spacy-transformer.null_annotation_setter.v1` | Don't set any additional annotations. |
+
## Custom attributes {#custom-attributes}

The component sets the following

diff --git a/website/docs/images/pipeline_transformer.svg b/website/docs/images/pipeline_transformer.svg
new file mode 100644
index 000000000..cfbf470cc
--- /dev/null
+++ b/website/docs/images/pipeline_transformer.svg
@@ -0,0 +1,37 @@
+[37 lines of SVG markup not shown: diagram of the processing pipeline with a transformer component]

diff --git a/website/docs/usage/transformers.md b/website/docs/usage/transformers.md
index d5ce4e891..791eaac37 100644
--- a/website/docs/usage/transformers.md
+++ b/website/docs/usage/transformers.md
@@ -1,10 +1,17 @@
---
title: Transformers
teaser: Using transformer models like BERT in spaCy
+menu:
+  - ['Installation', 'install']
+  - ['Runtime Usage', 'runtime']
+  - ['Training Usage', 'training']
---

+## Installation {#install hidden="true"}
+
spaCy v3.0 lets you use almost **any statistical model** to power your pipeline.
-You can use models implemented in a variety of frameworks, including TensorFlow,
+You can use models implemented in a variety of
+[frameworks](https://thinc.ai/docs/usage-frameworks), including TensorFlow,
PyTorch and MXNet. To keep things sane, spaCy expects models from these
frameworks to be wrapped with a common interface, using our machine learning
library [Thinc](https://thinc.ai).
A transformer model is just a statistical
@@ -15,34 +22,110 @@ that do the required plumbing. We also provide a pipeline component,
[`Transformer`](/api/transformer), that lets you do multi-task learning and
lets you save the transformer outputs for later use.

+To use transformers with spaCy, you need the
+[`spacy-transformers`](https://github.com/explosion/spacy-transformers) package
+installed. It takes care of all the setup behind the scenes, and makes sure the
+transformer pipeline component is available to spaCy.
+
-Try out a BERT-based model pipeline using this project template: swap in your
-data, edit the settings and hyperparameters and train, evaluate, package and
-visualize your model.
-
+```bash
+$ pip install spacy-transformers
+```
+
+### Customizing the settings {#training-custom-settings}
+
+To change any of the settings, you can edit the `config.cfg` and re-run the
+training. To change any of the functions, like the span getter, you can replace
+the name of the referenced function – e.g. `@span_getters = "sent_spans.v1"` to
+process sentences. You can also register your own functions using the
+`span_getters` registry:
+
+> #### config.cfg
+>
+> ```ini
+> [components.transformer.model.get_spans]
+> @span_getters = "custom_sent_spans"
+> ```

```python
-from spacy_transformers import Transformer
+### code.py
+import spacy_transformers

-trf = Transformer(
-    nlp.vocab,
-    TransformerModel(
-        "bert-base-cased",
-        get_spans=get_doc_spans,
-        tokenizer_config={"use_fast": True},
-    ),
-    annotation_setter=null_annotation_setter,
-    max_batch_size=32,
-)
+@spacy_transformers.registry.span_getters("custom_sent_spans")
+def configure_custom_sent_spans():
+    # A simple example: return one span per sentence (requires sentence
+    # boundaries to be set)
+    def get_sent_spans(docs):
+        return [list(doc.sents) for doc in docs]
+
+    return get_sent_spans
```

-The `components.transformer` block adds the `transformer` component to the
-pipeline, and the `components.transformer.model` block describes the creation of
-a Thinc [`Model`](https://thinc.ai/docs/api-model) object that will be passed
-into the component. The block names a function registered in the
-`@architectures` registry. This function will be looked up and called using the
-provided arguments. You're not limited to just that function --- you can write
-your own or use someone else's. The only limitation is that it must return an
-object of type `Model[List[Doc], FullTransformerBatch]`: that is, a Thinc model
-that takes a list of `Doc` objects, and returns a `FullTransformerBatch` object
-with the transformer data.
+To resolve the config during training, spaCy needs to know about your custom
+function. You can make it available via the `--code` argument, which can point
+to a Python file:

-The same idea applies to task models that power the downstream components. Most
-of spaCy's built-in model creation functions support a `tok2vec` argument, which
-should be a Thinc layer of type `Model[List[Doc], List[Floats2d]]`. This is
-where we'll plug in our transformer model, using the `Tok2VecTransformer` layer,
-which sneakily delegates to the `Transformer` pipeline component.
+```bash
+$ python -m spacy train ./train.spacy ./dev.spacy ./config.cfg --code ./code.py
+```

+### Customizing the model implementations {#training-custom-model}
+
+The [`Transformer`](/api/transformer) component expects a Thinc
+[`Model`](https://thinc.ai/docs/api-model) object to be passed in as its `model`
+argument. You're not limited to the implementation provided by
+`spacy-transformers` – the only requirement is that your registered function
+must return an object of type `Model[List[Doc], FullTransformerBatch]`: that is,
+a Thinc model that takes a list of [`Doc`](/api/doc) objects, and returns a
+[`FullTransformerBatch`](/api/transformer#fulltransformerbatch) object with the
+transformer data.
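+
+For example, assuming you've registered your own creation function under the
+hypothetical name `my_transformer_model.v1` (e.g. via the `@architectures`
+registry), you could swap it in by pointing the model block at it:
+
+```ini
+### config.cfg (excerpt)
+[components.transformer.model]
+@architectures = "my_transformer_model.v1"
+```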
+
+> #### Model type annotations
+>
+> In the documentation and code base, you may come across type annotations and
+> descriptions of [Thinc](https://thinc.ai) model types, like
+> `Model[List[Doc], List[Floats2d]]`. This so-called generic type describes the
+> layer and its input and output type – in this case, it takes a list of `Doc`
+> objects as the input and a list of 2-dimensional arrays of floats as the
+> output. You can read more about defining Thinc models
+> [here](https://thinc.ai/docs/usage-models). Also see the
+> [type checking](https://thinc.ai/docs/usage-type-checking) docs for how to
+> enable linting in your editor, so you get live feedback if your inputs and
+> outputs don't match.
+
+The same idea applies to task models that power the **downstream components**.
+Most of spaCy's built-in model creation functions support a `tok2vec` argument,
+which should be a Thinc layer of type `Model[List[Doc], List[Floats2d]]`. This
+is where we'll plug in our transformer model, using the
+[Tok2VecListener](/api/architectures#Tok2VecListener) layer, which sneakily
+delegates to the `Transformer` pipeline component.

```ini
-[nlp]
-lang = "en"
-pipeline = ["ner"]
-
+### config.cfg (excerpt) {highlight="12"}
[components.ner]
factory = "ner"
@@ -108,49 +255,24 @@ grad_factor = 1.0
@layers = "reduce_mean.v1"
```

-The `Tok2VecListener` layer expects a `pooling` layer, which needs to be of type
-`Model[Ragged, Floats2d]`. This layer determines how the vector for each spaCy
-token will be computed from the zero or more source rows the token is aligned
-against. Here we use the `reduce_mean` layer, which averages the wordpiece rows.
-We could instead use `reduce_last`, `reduce_max`, or a custom function you write
-yourself.
+The [Tok2VecListener](/api/architectures#Tok2VecListener) layer expects a
+[pooling layer](https://thinc.ai/docs/api-layers#reduction-ops), which needs to
+be of type `Model[Ragged, Floats2d]`. This layer determines how the vector for
+each spaCy token will be computed from the zero or more source rows the token
+is aligned against. Here we use the
+[`reduce_mean`](https://thinc.ai/docs/api-layers#reduce_mean) layer, which
+averages the wordpiece rows. We could instead use `reduce_last`,
+[`reduce_max`](https://thinc.ai/docs/api-layers#reduce_max), or a custom
+function you write yourself.

You can have multiple components all listening to the same transformer model,
and all passing gradients back to it. By default, all of the gradients will be
-equally weighted. You can control this with the `grad_factor` setting, which
+**equally weighted**. You can control this with the `grad_factor` setting, which
lets you reweight the gradients from the different listeners. For instance,
setting `grad_factor = 0` would disable gradients from one of the listeners,
while `grad_factor = 2.0` would multiply them by 2. This is similar to having a
custom learning rate for each component. Instead of a constant, you can also
provide a schedule, allowing you to freeze the shared parameters at the start of
training.
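+
+As a sketch, here's how you could down-weight the gradients sent back by the
+NER listener from the config excerpt above (the value `0.5` is arbitrary;
+`1.0` is the default):
+
+```ini
+### config.cfg (excerpt)
+[components.ner.model.tok2vec]
+@architectures = "spacy-transformers.Tok2VecListener.v1"
+grad_factor = 0.5
+```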
- -### Runtime usage - -Transformer models can be used as drop-in replacements for other types of neural -networks, so your spaCy pipeline can include them in a way that's completely -invisible to the user. Users will download, load and use the model in the -standard way, like any other spaCy pipeline. - -Instead of using the transformers as subnetworks directly, you can also use them -via the [`Transformer`](/api/transformer) pipeline component. This sets the -[`Doc._.trf_data`](/api/transformer#custom_attributes) extension attribute, -which lets you access the transformers outputs at runtime via the -`doc._.trf_data` extension attribute. You can also customize how the -`Transformer` object sets annotations onto the `Doc`, by customizing the -`Transformer.annotation_setter` object. This callback will be called with the -raw input and output data for the whole batch, along with the batch of `Doc` -objects, allowing you to implement whatever you need. - -```python -import spacy - -nlp = spacy.load("en_core_trf_lg") -for doc in nlp.pipe(["some text", "some other text"]): - doc._.trf_data.tensors - tokvecs = doc._.trf_data.tensors[-1] -``` - -The `nlp` object in this example is just like any other spaCy pipeline - - -->