Update docs for pipeline initialize() methods (#11221)

* Update documentation for dependency parser
* Update documentation for trainable_lemmatizer
* Update documentation for entity_linker
* Update documentation for ner
* Update documentation for morphologizer
* Update documentation for senter
* Update documentation for spancat
* Update documentation for tagger
* Update documentation for textcat
* Update documentation for tok2vec
* Run prettier on edited files
* Apply similar changes in transformer docs
* Remove need to say annotated example explicitly

  I removed the need to say "Must contain at least one annotated Example"
  because it's often a given that Examples will contain some gold-standard
  annotation.

* Run prettier on transformer docs
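
The example snippets in this change replace `lambda: []` with `lambda: examples`, because the `get_examples` callback is now documented as needing at least one `Example`. A minimal sketch of that contract follows. It assumes a hypothetical `require_examples` helper and uses placeholder strings in place of real `Example` objects; neither is part of spaCy, and this is not meant to mirror how spaCy itself validates the callback.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def require_examples(get_examples: Callable[[], Iterable[T]]) -> List[T]:
    """Call the callback once and fail early if it yields nothing.

    Hypothetical helper for illustration only; not part of spaCy.
    """
    examples = list(get_examples())
    if not examples:
        raise ValueError("get_examples must return at least one example")
    return examples


# Placeholder data: in real use these would be spacy.training.Example objects.
examples = ["example-1", "example-2"]

# The new style from the docs: a callback that returns a non-empty iterable.
checked = require_examples(lambda: examples)
print(len(checked))

# The old style from the docs: an empty list is rejected by this sketch.
try:
    require_examples(lambda: [])
except ValueError as err:
    print(f"rejected: {err}")
```

A guard like this could be run before calling a component's `initialize`, so that an empty sample is reported at the call site rather than later during model setup.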
2025-07-04 03:43:09 +03:00 · 2022-08-03 22:53:02 +08:00 · 2022-08-03 22:53:02 +08:00 · d993df41e5
commit d993df41e5
parent d0578c2ede
11 changed files with 85 additions and 85 deletions
--- a/website/docs/api/dependencyparser.md
+++ b/website/docs/api/dependencyparser.md
@@ -158,10 +158,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/dependencyparser#call) and
 ## DependencyParser.initialize {#initialize tag="method" new="3"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -179,7 +179,7 @@ This method was previously called `begin_training`.
 >
 > ```python
 > parser = nlp.add_pipe("parser")
-> parser.initialize(lambda: [], nlp=nlp)
+> parser.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -193,7 +193,7 @@ This method was previously called `begin_training`.
 | Name           | Description                                                                                                                                                                                                                                                                                                                                                                                                            |
 | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                                                  |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                             |
 | _keyword-only_ |                                                                                                                                                                                                                                                                                                                                                                                                                        |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                                                   |
 | `labels`       | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[Dict[str, Dict[str, int]]]~~ |
--- a/website/docs/api/edittreelemmatizer.md
+++ b/website/docs/api/edittreelemmatizer.md
@@ -141,10 +141,10 @@ and [`pipe`](/api/edittreelemmatizer#pipe) delegate to the
 ## EditTreeLemmatizer.initialize {#initialize tag="method" new="3"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -156,7 +156,7 @@ config.
 >
 > ```python
 > lemmatizer = nlp.add_pipe("trainable_lemmatizer", name="lemmatizer")
-> lemmatizer.initialize(lambda: [], nlp=nlp)
+> lemmatizer.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -170,7 +170,7 @@ config.
 | Name           | Description                                                                                                                                                                                                                                                                                                                                                                                                |
 | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                                      |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                 |
 | _keyword-only_ |                                                                                                                                                                                                                                                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                                       |
 | `labels`       | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[Iterable[str]]~~ |
--- a/website/docs/api/entitylinker.md
+++ b/website/docs/api/entitylinker.md
@@ -185,10 +185,10 @@ with the current vocab.
 ## EntityLinker.initialize {#initialize tag="method" new="3"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize).
@@ -208,12 +208,12 @@ This method was previously called `begin_training`.
 >
 > ```python
 > entity_linker = nlp.add_pipe("entity_linker")
-> entity_linker.initialize(lambda: [], nlp=nlp, kb_loader=my_kb)
+> entity_linker.initialize(lambda: examples, nlp=nlp, kb_loader=my_kb)
 > ```
 | Name           | Description                                                                                                                                                                |
-| -------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
+| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~ |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~ |
 | _keyword-only_ |                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                       |
 | `kb_loader`    | Function that creates a [`KnowledgeBase`](/api/kb) from a `Vocab` instance. ~~Callable[[Vocab], KnowledgeBase]~~                                                           |
--- a/website/docs/api/entityrecognizer.md
+++ b/website/docs/api/entityrecognizer.md
@@ -154,10 +154,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/entityrecognizer#call) and
 ## EntityRecognizer.initialize {#initialize tag="method" new="3"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -175,7 +175,7 @@ This method was previously called `begin_training`.
 >
 > ```python
 > ner = nlp.add_pipe("ner")
-> ner.initialize(lambda: [], nlp=nlp)
+> ner.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -189,7 +189,7 @@ This method was previously called `begin_training`.
 | Name           | Description                                                                                                                                                                                                                                                                                                                                                                                                            |
 | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                                                  |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                             |
 | _keyword-only_ |                                                                                                                                                                                                                                                                                                                                                                                                                        |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                                                   |
 | `labels`       | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[Dict[str, Dict[str, int]]]~~ |
--- a/website/docs/api/morphologizer.md
+++ b/website/docs/api/morphologizer.md
@@ -147,10 +147,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/morphologizer#call) and
 ## Morphologizer.initialize {#initialize tag="method"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -162,7 +162,7 @@ config.
 >
 > ```python
 > morphologizer = nlp.add_pipe("morphologizer")
-> morphologizer.initialize(lambda: [], nlp=nlp)
+> morphologizer.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -176,7 +176,7 @@ config.
 | Name           | Description                                                                                                                                                                                                                                                                                                                                                                                       |
 | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                             |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                        |
 | _keyword-only_ |                                                                                                                                                                                                                                                                                                                                                                                                   |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                              |
 | `labels`       | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[dict]~~ |
--- a/website/docs/api/sentencerecognizer.md
+++ b/website/docs/api/sentencerecognizer.md
@@ -132,10 +132,10 @@ and [`pipe`](/api/sentencerecognizer#pipe) delegate to the
 ## SentenceRecognizer.initialize {#initialize tag="method"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize).
@@ -144,12 +144,12 @@ by [`Language.initialize`](/api/language#initialize).
 >
 > ```python
 > senter = nlp.add_pipe("senter")
-> senter.initialize(lambda: [], nlp=nlp)
+> senter.initialize(lambda: examples, nlp=nlp)
 > ```
 | Name           | Description                                                                                                                                                                |
-| -------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
+| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~ |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~ |
 | _keyword-only_ |                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                       |
--- a/website/docs/api/spancategorizer.md
+++ b/website/docs/api/spancategorizer.md
@@ -147,10 +147,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/spancategorizer#call) and
 ## SpanCategorizer.initialize {#initialize tag="method"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -162,7 +162,7 @@ config.
 >
 > ```python
 > spancat = nlp.add_pipe("spancat")
-> spancat.initialize(lambda: [], nlp=nlp)
+> spancat.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -176,7 +176,7 @@ config.
 | Name           | Description                                                                                                                                                                                                                                                                                                                                                                                                |
 | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                                      |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                 |
 | _keyword-only_ |                                                                                                                                                                                                                                                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                                       |
 | `labels`       | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[Iterable[str]]~~ |
--- a/website/docs/api/tagger.md
+++ b/website/docs/api/tagger.md
@@ -130,10 +130,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/tagger#call) and
 ## Tagger.initialize {#initialize tag="method" new="3"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -151,7 +151,7 @@ This method was previously called `begin_training`.
 >
 > ```python
 > tagger = nlp.add_pipe("tagger")
-> tagger.initialize(lambda: [], nlp=nlp)
+> tagger.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -165,7 +165,7 @@ This method was previously called `begin_training`.
 | Name           | Description                                                                                                                                                                                                                                                                                                                                                                                                |
 | -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                                      |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                 |
 | _keyword-only_ |                                                                                                                                                                                                                                                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                                       |
 | `labels`       | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[Iterable[str]]~~ |
--- a/website/docs/api/textcategorizer.md
+++ b/website/docs/api/textcategorizer.md
@@ -176,10 +176,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/textcategorizer#call) and
 ## TextCategorizer.initialize {#initialize tag="method" new="3"}
 Initialize the component for training. `get_examples` should be a function that
-returns an iterable of [`Example`](/api/example) objects. The data examples are
+returns an iterable of [`Example`](/api/example) objects. **At least one example
-used to **initialize the model** of the component and can either be the full
+should be supplied.** The data examples are used to **initialize the model** of
-training data or a representative sample. Initialization includes validating the
+the component and can either be the full training data or a representative
-network,
+sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize) and lets you customize
@@ -197,7 +197,7 @@ This method was previously called `begin_training`.
 >
 > ```python
 > textcat = nlp.add_pipe("textcat")
-> textcat.initialize(lambda: [], nlp=nlp)
+> textcat.initialize(lambda: examples, nlp=nlp)
 > ```
 >
 > ```ini
@@ -212,7 +212,7 @@ This method was previously called `begin_training`.
 | Name             | Description                                                                                                                                                                                                                                                                                                                                                                                                |
 | ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples`   | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                                                      |
+| `get_examples`   | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~                                                                                                                                                                                                                                 |
 | _keyword-only_   |                                                                                                                                                                                                                                                                                                                                                                                                            |
 | `nlp`            | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                                                                                                                                                                                                                                                       |
 | `labels`         | The label information to add to the component, as provided by the [`label_data`](#label_data) property after initialization. To generate a reusable JSON file from your data, you should run the [`init labels`](/api/cli#init-labels) command. If no labels are provided, the `get_examples` callback is used to extract the labels from the data, which may be a lot slower. ~~Optional[Iterable[str]]~~ |
--- a/website/docs/api/tok2vec.md
+++ b/website/docs/api/tok2vec.md
@@ -127,10 +127,10 @@ and [`set_annotations`](/api/tok2vec#set_annotations) methods.
 Initialize the component for training and return an
 [`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
-function that returns an iterable of [`Example`](/api/example) objects. The data
+function that returns an iterable of [`Example`](/api/example) objects. **At
-examples are used to **initialize the model** of the component and can either be
+least one example should be supplied.** The data examples are used to
-the full training data or a representative sample. Initialization includes
+**initialize the model** of the component and can either be the full training
-validating the network,
+data or a representative sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize).
@@ -139,12 +139,12 @@ by [`Language.initialize`](/api/language#initialize).
 >
 > ```python
 > tok2vec = nlp.add_pipe("tok2vec")
-> tok2vec.initialize(lambda: [], nlp=nlp)
+> tok2vec.initialize(lambda: examples, nlp=nlp)
 > ```
 | Name           | Description                                                                                                                                                                |
-| -------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
+| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~ |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~ |
 | _keyword-only_ |                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                       |
--- a/website/docs/api/transformer.md
+++ b/website/docs/api/transformer.md
@@ -175,10 +175,10 @@ applied to the `Doc` in order. Both [`__call__`](/api/transformer#call) and
 Initialize the component for training and return an
 [`Optimizer`](https://thinc.ai/docs/api-optimizers). `get_examples` should be a
-function that returns an iterable of [`Example`](/api/example) objects. The data
+function that returns an iterable of [`Example`](/api/example) objects. **At
-examples are used to **initialize the model** of the component and can either be
+least one example should be supplied.** The data examples are used to
-the full training data or a representative sample. Initialization includes
+**initialize the model** of the component and can either be the full training
-validating the network,
+data or a representative sample. Initialization includes validating the network,
 [inferring missing shapes](https://thinc.ai/docs/usage-models#validation) and
 setting up the label scheme based on the data. This method is typically called
 by [`Language.initialize`](/api/language#initialize).
@@ -187,12 +187,12 @@ by [`Language.initialize`](/api/language#initialize).
 >
 > ```python
 > trf = nlp.add_pipe("transformer")
-> trf.initialize(lambda: iter([]), nlp=nlp)
+> trf.initialize(lambda: examples, nlp=nlp)
 > ```
 | Name           | Description                                                                                                                                                                |
-| -------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
+| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. ~~Callable[[], Iterable[Example]]~~ |
+| `get_examples` | Function that returns gold-standard annotations in the form of [`Example`](/api/example) objects. Must contain at least one `Example`. ~~Callable[[], Iterable[Example]]~~ |
 | _keyword-only_ |                                                                                                                                                                            |
 | `nlp`          | The current `nlp` object. Defaults to `None`. ~~Optional[Language]~~                                                                                                       |
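
The updated snippets in this diff pass `lambda: examples` to `initialize`, but `examples` itself is not defined in any of them. The following is a minimal sketch of how such a list might be built for the `textcat` case, assuming spaCy v3 is installed. The texts and the `POSITIVE`/`NEGATIVE` labels are made up for illustration and do not come from the commit.

```python
import spacy
from spacy.training import Example

# Blank English pipeline with an untrained text categorizer.
nlp = spacy.blank("en")
textcat = nlp.add_pipe("textcat")

# Hypothetical gold-standard data: each text gets exactly one positive category.
train_data = [
    ("I loved this film", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("This was a waste of time", {"cats": {"POSITIVE": 0.0, "NEGATIVE": 1.0}}),
]

# Example.from_dict pairs an unannotated Doc with its reference annotations.
examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in train_data
]

# get_examples is a zero-argument callable that returns the examples.
textcat.initialize(lambda: examples, nlp=nlp)

# The label scheme is inferred from the supplied examples.
print(textcat.labels)
```

In a normal training run this call is made for you by `Language.initialize`; calling the component's method directly, as here, is mainly useful for experimentation.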