Mirror of https://github.com/explosion/spaCy.git
Update docs and formatting
parent aa27e3f1f2
commit 2e567a47c2
@@ -293,7 +293,11 @@ context, the original parameters are restored.
 
 ## DependencyParser.add_label {#add_label tag="method"}
 
-Add a new label to the pipe.
+Add a new label to the pipe. Note that you don't have to call this method if you
+provide a **representative data sample** to the
+[`begin_training`](#begin_training) method. In this case, all labels found in
+the sample will be automatically added to the model, and the output dimension
+will be [inferred](/usage/layers-architectures#shape-inference) automatically.
 
 > #### Example
 >
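In practice, the workflow the added lines describe looks roughly like this — a minimal sketch against the `begin_training` API referenced above (later spaCy versions rename it to `initialize`); `TRAIN_DATA` and its annotations are made up for the example:

```python
import spacy
from spacy.training import Example

# Made-up sample: (text, annotations) pairs with dependency heads and labels.
TRAIN_DATA = [
    ("She ate the pizza", {"heads": [1, 1, 3, 1], "deps": ["nsubj", "ROOT", "det", "dobj"]}),
]

nlp = spacy.blank("en")
parser = nlp.add_pipe("parser")

# No parser.add_label() calls needed: the labels are read off the sample
# and the model's output dimension is inferred from them.
examples = [Example.from_dict(nlp.make_doc(text), annots) for text, annots in TRAIN_DATA]
optimizer = nlp.begin_training(lambda: examples)
```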
@@ -307,17 +311,13 @@ Add a new label to the pipe.
 | `label` | The label to add. ~~str~~ |
 | **RETURNS** | `0` if the label is already present, otherwise `1`. ~~int~~ |
 
-Note that you don't have to call `pipe.add_label` if you provide a
-representative data sample to the [`begin_training`](#begin_training) method. In
-this case, all labels found in the sample will be automatically added to the
-model, and the output dimension will be
-[inferred](/usage/layers-architectures#shape-inference) automatically.
-
 ## DependencyParser.set_output {#set_output tag="method"}
 
 Change the output dimension of the component's model by calling the model's
 attribute `resize_output`. This is a function that takes the original model and
-the new output dimension `nO`, and changes the model in place.
+the new output dimension `nO`, and changes the model in place. When resizing an
+already trained model, care should be taken to avoid the "catastrophic
+forgetting" problem.
 
 > #### Example
 >
@@ -330,9 +330,6 @@ the new output dimension `nO`, and changes the model in place.
 | ---- | --------------------------------- |
 | `nO` | The new output dimension. ~~int~~ |
 
-When resizing an already trained model, care should be taken to avoid the
-"catastrophic forgetting" problem.
-
 ## DependencyParser.to_disk {#to_disk tag="method"}
 
 Serialize the pipe to disk.
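A short usage sketch for the resizing behavior documented above (component and dimension are illustrative; `resize_output` is the model attribute the hunk refers to):

```python
# set_output(nO) calls the model's "resize_output" attribute with
# (model, nO) and changes the model in place.
pipe = nlp.get_pipe("parser")
if pipe.is_resizable():
    pipe.set_output(512)  # illustrative new output dimension

# When resizing an already trained model, keep training on data that also
# covers the outputs it already produced, to limit "catastrophic forgetting".
```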
@@ -281,7 +281,11 @@ context, the original parameters are restored.
 
 ## EntityRecognizer.add_label {#add_label tag="method"}
 
-Add a new label to the pipe.
+Add a new label to the pipe. Note that you don't have to call this method if you
+provide a **representative data sample** to the
+[`begin_training`](#begin_training) method. In this case, all labels found in
+the sample will be automatically added to the model, and the output dimension
+will be [inferred](/usage/layers-architectures#shape-inference) automatically.
 
 > #### Example
 >
@@ -295,17 +299,13 @@ Add a new label to the pipe.
 | `label` | The label to add. ~~str~~ |
 | **RETURNS** | `0` if the label is already present, otherwise `1`. ~~int~~ |
 
-Note that you don't have to call `pipe.add_label` if you provide a
-representative data sample to the [`begin_training`](#begin_training) method. In
-this case, all labels found in the sample will be automatically added to the
-model, and the output dimension will be
-[inferred](/usage/layers-architectures#shape-inference) automatically.
-
 ## EntityRecognizer.set_output {#set_output tag="method"}
 
 Change the output dimension of the component's model by calling the model's
 attribute `resize_output`. This is a function that takes the original model and
-the new output dimension `nO`, and changes the model in place.
+the new output dimension `nO`, and changes the model in place. When resizing an
+already trained model, care should be taken to avoid the "catastrophic
+forgetting" problem.
 
 > #### Example
 >
@@ -318,9 +318,6 @@ the new output dimension `nO`, and changes the model in place.
 | ---- | --------------------------------- |
 | `nO` | The new output dimension. ~~int~~ |
 
-When resizing an already trained model, care should be taken to avoid the
-"catastrophic forgetting" problem.
-
 ## EntityRecognizer.to_disk {#to_disk tag="method"}
 
 Serialize the pipe to disk.
@@ -259,7 +259,11 @@ context, the original parameters are restored.
 Add a new label to the pipe. If the `Morphologizer` should set annotations for
 both `pos` and `morph`, the label should include the UPOS as the feature `POS`.
 Raises an error if the output dimension is already set, or if the model has
-already been fully [initialized](#begin_training).
+already been fully [initialized](#begin_training). Note that you don't have to
+call this method if you provide a **representative data sample** to the
+[`begin_training`](#begin_training) method. In this case, all labels found in
+the sample will be automatically added to the model, and the output dimension
+will be [inferred](/usage/layers-architectures#shape-inference) automatically.
 
 > #### Example
 >
@@ -273,12 +277,6 @@ already been fully [initialized](#begin_training).
 | `label` | The label to add. ~~str~~ |
 | **RETURNS** | `0` if the label is already present, otherwise `1`. ~~int~~ |
 
-Note that you don't have to call `pipe.add_label` if you provide a
-representative data sample to the [`begin_training`](#begin_training) method. In
-this case, all labels found in the sample will be automatically added to the
-model, and the output dimension will be
-[inferred](/usage/layers-architectures#shape-inference) automatically.
-
 ## Morphologizer.to_disk {#to_disk tag="method"}
 
 Serialize the pipe to disk.
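For context, a label that sets both `pos` and `morph` as described in the hunks above could be added like this (a sketch; the feature strings are illustrative):

```python
import spacy

nlp = spacy.blank("en")
morphologizer = nlp.add_pipe("morphologizer")

# Include the UPOS as the "POS" feature so the component sets token.pos
# in addition to token.morph.
morphologizer.add_label("POS=NOUN|Number=Sing")
morphologizer.add_label("Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin")
```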
@@ -293,12 +293,6 @@ context, the original parameters are restored.
 > pipe.add_label("MY_LABEL")
 > ```
 
-<Infobox variant="danger">
-
-This method needs to be overwritten with your own custom `add_label` method.
-
-</Infobox>
-
 Add a new label to the pipe, to be predicted by the model. The actual
 implementation depends on the specific component, but in general `add_label`
 shouldn't be called if the output dimension is already set, or if the model has
@@ -308,6 +302,12 @@ the component is [resizable](#is_resizable), in which case
 [`set_output`](#set_output) should be called to ensure that the model is
 properly resized.
 
+<Infobox variant="danger">
+
+This method needs to be overwritten with your own custom `add_label` method.
+
+</Infobox>
+
 | Name | Description |
 | ----------- | ------------------------------------------------------- |
 | `label` | The label to add. ~~str~~ |
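The contract the table and the warning above describe can be sketched in plain Python, independent of the base class a real component would subclass (the class name here is illustrative):

```python
class CustomLabelStore:
    """Illustrative only: implements the documented add_label contract of
    returning 0 if the label is already present and 1 otherwise."""

    def __init__(self):
        self._labels = []

    def add_label(self, label: str) -> int:
        if label in self._labels:
            return 0
        self._labels.append(label)
        return 1
```

A real trainable component would tie this to its model's output dimension, or rely on `set_output` if the component is resizable.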
@@ -326,41 +326,37 @@ model, and the output dimension will be
 > ```python
 > can_resize = pipe.is_resizable()
 > ```
+>
+> ```python
+> ### Custom resizing
+> def custom_resize(model, new_nO):
+>     # adjust model
+>     return model
+>
+> custom_model.attrs["resize_output"] = custom_resize
+> ```
 
 Check whether or not the output dimension of the component's model can be
 resized. If this method returns `True`, [`set_output`](#set_output) can be
 called to change the model's output dimension.
 
+For built-in components that are not resizable, you have to create and train a
+new model from scratch with the appropriate architecture and output dimension.
+For custom components, you can implement a `resize_output` function and add it
+as an attribute to the component's model.
+
 | Name | Description |
 | ----------- | ---------------------------------------------------------------------------------------------- |
 | **RETURNS** | Whether or not the output dimension of the model can be changed after initialization. ~~bool~~ |
 
-> #### Example
->
-> ```python
-> def custom_resize(model, new_nO):
->     # adjust model
->     return model
-> custom_model.attrs["resize_output"] = custom_resize
-> ```
-
-For built-in components that are not resizable, you have to create and train a
-new model from scratch with the appropriate architecture and output dimension.
-
-For custom components, you can implement a `resize_output` function and add it
-as an attribute to the component's model.
-
 ## Pipe.set_output {#set_output tag="method"}
 
 Change the output dimension of the component's model. If the component is not
-[resizable](#is_resizable), this method will throw a `NotImplementedError`.
-If a component is resizable, the model's attribute `resize_output` will be
-called. This is a function that takes the original model and the new output
-dimension `nO`, and changes the model in place.
-
-When resizing an already trained model, care should be taken to avoid the
-"catastrophic forgetting" problem.
+[resizable](#is_resizable), this method will raise a `NotImplementedError`. If a
+component is resizable, the model's attribute `resize_output` will be called.
+This is a function that takes the original model and the new output dimension
+`nO`, and changes the model in place. When resizing an already trained model,
+care should be taken to avoid the "catastrophic forgetting" problem.
 
 > #### Example
 >
@@ -289,7 +289,12 @@ context, the original parameters are restored.
 ## Tagger.add_label {#add_label tag="method"}
 
 Add a new label to the pipe. Raises an error if the output dimension is already
-set, or if the model has already been fully [initialized](#begin_training).
+set, or if the model has already been fully [initialized](#begin_training). Note
+that you don't have to call this method if you provide a **representative data
+sample** to the [`begin_training`](#begin_training) method. In this case, all
+labels found in the sample will be automatically added to the model, and the
+output dimension will be [inferred](/usage/layers-architectures#shape-inference)
+automatically.
 
 > #### Example
 >
@@ -303,12 +308,6 @@ set, or if the model has already been fully [initialized](#begin_training).
 | `label` | The label to add. ~~str~~ |
 | **RETURNS** | `0` if the label is already present, otherwise `1`. ~~int~~ |
 
-Note that you don't have to call `pipe.add_label` if you provide a
-representative data sample to the [`begin_training`](#begin_training) method. In
-this case, all labels found in the sample will be automatically added to the
-model, and the output dimension will be
-[inferred](/usage/layers-architectures#shape-inference) automatically.
-
 ## Tagger.to_disk {#to_disk tag="method"}
 
 Serialize the pipe to disk.
@@ -298,7 +298,12 @@ Modify the pipe's model, to use the given parameter values.
 ## TextCategorizer.add_label {#add_label tag="method"}
 
 Add a new label to the pipe. Raises an error if the output dimension is already
-set, or if the model has already been fully [initialized](#begin_training).
+set, or if the model has already been fully [initialized](#begin_training). Note
+that you don't have to call this method if you provide a **representative data
+sample** to the [`begin_training`](#begin_training) method. In this case, all
+labels found in the sample will be automatically added to the model, and the
+output dimension will be [inferred](/usage/layers-architectures#shape-inference)
+automatically.
 
 > #### Example
 >
@@ -312,12 +317,6 @@ set, or if the model has already been fully [initialized](#begin_training).
 | `label` | The label to add. ~~str~~ |
 | **RETURNS** | `0` if the label is already present, otherwise `1`. ~~int~~ |
 
-Note that you don't have to call `pipe.add_label` if you provide a
-representative data sample to the [`begin_training`](#begin_training) method. In
-this case, all labels found in the sample will be automatically added to the
-model, and the output dimension will be
-[inferred](/usage/layers-architectures#shape-inference) automatically.
-
 ## TextCategorizer.to_disk {#to_disk tag="method"}
 
 Serialize the pipe to disk.
@@ -5,8 +5,7 @@ menu:
   - ['Type Signatures', 'type-sigs']
   - ['Swapping Architectures', 'swap-architectures']
   - ['PyTorch & TensorFlow', 'frameworks']
-  - ['Custom Models', 'custom-models']
-  - ['Thinc implementation', 'thinc']
+  - ['Custom Thinc Models', 'thinc']
   - ['Trainable Components', 'components']
 next: /usage/projects
 ---
@@ -226,13 +225,24 @@ you'll be able to try it out in any of the spaCy components.
 
 Thinc allows you to [wrap models](https://thinc.ai/docs/usage-frameworks)
 written in other machine learning frameworks like PyTorch, TensorFlow and MXNet
-using a unified [`Model`](https://thinc.ai/docs/api-model) API.
-
-For example, let's use PyTorch to define a very simple Neural network consisting
-of two hidden `Linear` layers with `ReLU` activation and dropout, and a
-softmax-activated output layer.
+using a unified [`Model`](https://thinc.ai/docs/api-model) API. This makes it
+easy to use a model implemented in a different framework to power a component in
+your spaCy pipeline. For example, to wrap a PyTorch model as a Thinc `Model`,
+you can use Thinc's
+[`PyTorchWrapper`](https://thinc.ai/docs/api-layers#pytorchwrapper):
 
 ```python
+from thinc.api import PyTorchWrapper
+
+wrapped_pt_model = PyTorchWrapper(torch_model)
+```
+
+Let's use PyTorch to define a very simple neural network consisting of two
+hidden `Linear` layers with `ReLU` activation and dropout, and a
+softmax-activated output layer:
+
+```python
+### PyTorch model
 from torch import nn
 
 torch_model = nn.Sequential(
@@ -246,15 +256,6 @@ torch_model = nn.Sequential(
 )
 ```
 
-This PyTorch model can be wrapped as a Thinc `Model` by using Thinc's
-`PyTorchWrapper`:
-
-```python
-from thinc.api import PyTorchWrapper
-
-wrapped_pt_model = PyTorchWrapper(torch_model)
-```
-
 The resulting wrapped `Model` can be used as a **custom architecture** as such,
 or can be a **subcomponent of a larger model**. For instance, we can use Thinc's
 [`chain`](https://thinc.ai/docs/api-layers#chain) combinator, which works like
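A self-contained sketch of the wrapping step these hunks document, with made-up dimensions (requires `torch` and `thinc`):

```python
import numpy
from thinc.api import PyTorchWrapper, chain, Relu
from torch import nn

torch_model = nn.Sequential(nn.Linear(32, 32), nn.ReLU())
wrapped_pt_model = PyTorchWrapper(torch_model)      # now a regular Thinc Model

# The wrapped model can be used as a subcomponent of a larger Thinc model.
model = chain(wrapped_pt_model, Relu(nO=10, nI=32))
model.initialize(X=numpy.zeros((2, 32), dtype="f"))
Y = model.predict(numpy.zeros((2, 32), dtype="f"))  # array of shape (2, 10)
```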
@@ -273,21 +274,26 @@ model = chain(char_embed, with_array(wrapped_pt_model))
 In the above example, we have combined our custom PyTorch model with a character
 embedding layer defined by spaCy.
 [CharacterEmbed](/api/architectures#CharacterEmbed) returns a `Model` that takes
-a `List[Doc]` as input, and outputs a `List[Floats2d]`. To make sure that the
-wrapped PyTorch model receives valid inputs, we use Thinc's
+a ~~List[Doc]~~ as input, and outputs a ~~List[Floats2d]~~. To make sure that
+the wrapped PyTorch model receives valid inputs, we use Thinc's
 [`with_array`](https://thinc.ai/docs/api-layers#with_array) helper.
 
-As another example, you could have a model where you use PyTorch just for the
-transformer layers, and use "native" Thinc layers to do fiddly input and output
-transformations and add on task-specific "heads", as efficiency is less of a
-consideration for those parts of the network.
+You could also implement a model that only uses PyTorch for the transformer
+layers, and "native" Thinc layers to do fiddly input and output transformations
+and add on task-specific "heads", as efficiency is less of a consideration for
+those parts of the network.
 
-## Custom models for trainable components {#custom-models}
+### Using wrapped models {#frameworks-usage}
 
 To use our custom model including the PyTorch subnetwork, all we need to do is
-register the architecture. The full example then becomes:
+register the architecture using the
+[`architectures` registry](/api/top-level#registry). This will assign the
+architecture a name so spaCy knows how to find it, and allows passing in
+arguments like hyperparameters via the [config](/usage/training#config). The
+full example then becomes:
 
 ```python
+### Registering the architecture {highlight="9"}
 from typing import List
 from thinc.types import Floats2d
 from thinc.api import Model, PyTorchWrapper, chain, with_array
@@ -297,7 +303,7 @@ from spacy.ml import CharacterEmbed
 from torch import nn
 
 @spacy.registry.architectures("CustomTorchModel.v1")
-def TorchModel(
+def create_torch_model(
     nO: int,
     width: int,
     hidden_width: int,
@@ -321,8 +327,10 @@ def TorchModel(
     return model
 ```
 
-Now you can use this model definition in any existing trainable spaCy component,
-by specifying it in the config file:
+The model definition can now be used in any existing trainable spaCy component,
+by specifying it in the config file. In this configuration, all required
+parameters for the various subcomponents of the custom architecture are passed
+in as settings via the config.
 
 ```ini
 ### config.cfg (excerpt) {highlight="5-5"}
@@ -340,29 +348,28 @@ nC = 8
 dropout = 0.2
 ```
 
-In this configuration, we pass all required parameters for the various
-subcomponents of the custom architecture as settings in the training config
-file. Remember that it is best not to rely on any (hidden) default values, to
-ensure that training configs are complete and experiments fully reproducible.
+<Infobox variant="warning">
 
-## Thinc implemention details {#thinc}
+Remember that it is best not to rely on any (hidden) default values, to ensure
+that training configs are complete and experiments fully reproducible.
 
-Ofcourse it's also possible to define the `Model` from the previous section
+</Infobox>
+
+## Custom models with Thinc {#thinc}
+
+Of course it's also possible to define the `Model` from the previous section
 entirely in Thinc. The Thinc documentation provides details on the
 [various layers](https://thinc.ai/docs/api-layers) and helper functions
-available.
-
-The combinators often used in Thinc can be used to
-[overload operators](https://thinc.ai/docs/usage-models#operators). A common
-usage is to bind `chain` to `>>`. The "native" Thinc version of our simple
-neural network would then become:
+available. Combinators can also be used to
+[overload operators](https://thinc.ai/docs/usage-models#operators) and a common
+usage pattern is to bind `chain` to `>>`. The "native" Thinc version of our
+simple neural network would then become:
 
 ```python
 from thinc.api import chain, with_array, Model, Relu, Dropout, Softmax
 from spacy.ml import CharacterEmbed
 
 char_embed = CharacterEmbed(width, embed_size, nM, nC)
 
 with Model.define_operators({">>": chain}):
     layers = (
         Relu(hidden_width, width)
@@ -374,20 +381,37 @@ with Model.define_operators({">>": chain}):
     model = char_embed >> with_array(layers)
 ```
 
-**⚠️ Note that Thinc layers define the output dimension (`nO`) as the first
-argument, followed (optionally) by the input dimension (`nI`). This is in
-contrast to how the PyTorch layers are defined, where `in_features` precedes
-`out_features`.**
+<Infobox variant="warning" title="Important note on inputs and outputs">
 
-### Shape inference in thinc {#shape-inference}
+Note that Thinc layers define the output dimension (`nO`) as the first argument,
+followed (optionally) by the input dimension (`nI`). This is in contrast to how
+the PyTorch layers are defined, where `in_features` precedes `out_features`.
 
-It is not strictly necessary to define all the input and output dimensions for
-each layer, as Thinc can perform
+</Infobox>
+
+### Shape inference in Thinc {#thinc-shape-inference}
+
+It is **not** strictly necessary to define all the input and output dimensions
+for each layer, as Thinc can perform
 [shape inference](https://thinc.ai/docs/usage-models#validation) between
 sequential layers by matching up the output dimensionality of one layer to the
 input dimensionality of the next. This means that we can simplify the `layers`
 definition:
 
+> #### Diff
+>
+> ```diff
+> layers = (
+>     Relu(hidden_width, width)
+>     >> Dropout(dropout)
+> -   >> Relu(hidden_width, hidden_width)
+> +   >> Relu(hidden_width)
+>     >> Dropout(dropout)
+> -   >> Softmax(nO, hidden_width)
+> +   >> Softmax(nO)
+> )
+> ```
+
 ```python
 with Model.define_operators({">>": chain}):
     layers = (
@@ -399,12 +423,14 @@ with Model.define_operators({">>": chain}):
     )
 ```
 
-Thinc can go one step further and deduce the correct input dimension of the
-first layer, and output dimension of the last. To enable this functionality, you
-have to call [`model.initialize`](https://thinc.ai/docs/api-model#initialize)
-with an input sample `X` and an output sample `Y` with the correct dimensions.
+Thinc can even go one step further and **deduce the correct input dimension** of
+the first layer, and output dimension of the last. To enable this functionality,
+you have to call
+[`Model.initialize`](https://thinc.ai/docs/api-model#initialize) with an **input
+sample** `X` and an **output sample** `Y` with the correct dimensions:
 
 ```python
+### Shape inference with initialization {highlight="3,7,10"}
 with Model.define_operators({">>": chain}):
     layers = (
         Relu(hidden_width)
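To make the initialization step concrete, here is a minimal, self-contained sketch (shapes and widths are made up):

```python
import numpy
from thinc.api import Model, chain, Relu, Softmax

hidden_width = 128
with Model.define_operators({">>": chain}):
    model = Relu(hidden_width) >> Softmax()

X = numpy.zeros((8, 300), dtype="f")  # input sample: 8 rows, 300 features
Y = numpy.zeros((8, 10), dtype="f")   # output sample: 8 rows, 10 classes
model.initialize(X=X, Y=Y)            # nI=300 and nO=10 are inferred
```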
@@ -418,21 +444,21 @@ with Model.define_operators({">>": chain}):
 ```
 
 The built-in [pipeline components](/usage/processing-pipelines) in spaCy ensure
-that their internal models are always initialized with appropriate sample data.
-In this case, `X` is typically a `List` of `Doc` objects, while `Y` is a `List`
-of 1D or 2D arrays, depending on the specific task. This functionality is
-triggered when [`nlp.begin_training`](/api/language#begin_training) is called.
+that their internal models are **always initialized** with appropriate sample
+data. In this case, `X` is typically a ~~List[Doc]~~, while `Y` is typically a
+~~List[Array1d]~~ or ~~List[Array2d]~~, depending on the specific task. This
+functionality is triggered when
+[`nlp.begin_training`](/api/language#begin_training) is called.
 
-### Dropout and normalization {#drop-norm}
+### Dropout and normalization in Thinc {#thinc-dropout-norm}
 
-Many of the `Thinc` layers allow you to define a `dropout` argument that will
-result in "chaining" an additional
+Many of the available Thinc [layers](https://thinc.ai/docs/api-layers) allow you
+to define a `dropout` argument that will result in "chaining" an additional
 [`Dropout`](https://thinc.ai/docs/api-layers#dropout) layer. Optionally, you can
 often specify whether or not you want to add layer normalization, which would
 result in an additional
-[`LayerNorm`](https://thinc.ai/docs/api-layers#layernorm) layer.
-
-That means that the following `layers` definition is equivalent to the previous:
+[`LayerNorm`](https://thinc.ai/docs/api-layers#layernorm) layer. That means that
+the following `layers` definition is equivalent to the previous:
 
 ```python
 with Model.define_operators({">>": chain}):
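For reference, the equivalent form this hunk alludes to amounts to passing `dropout` (and optionally `normalize`) directly to the layers instead of chaining explicit `Dropout` layers; a sketch with made-up widths:

```python
from thinc.api import Model, chain, Relu, Softmax

hidden_width, dropout, nO = 128, 0.2, 10
with Model.define_operators({">>": chain}):
    layers = (
        Relu(hidden_width, dropout=dropout)
        >> Relu(hidden_width, dropout=dropout)
        >> Softmax(nO)
    )
```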
@@ -34,6 +34,8 @@
     "Floats2d": "https://thinc.ai/docs/api-types#types",
     "Floats3d": "https://thinc.ai/docs/api-types#types",
     "FloatsXd": "https://thinc.ai/docs/api-types#types",
+    "Array1d": "https://thinc.ai/docs/api-types#types",
+    "Array2d": "https://thinc.ai/docs/api-types#types",
     "Ops": "https://thinc.ai/docs/api-backends#ops",
     "cymem.Pool": "https://github.com/explosion/cymem",
     "preshed.BloomFilter": "https://github.com/explosion/preshed",