mirror of https://github.com/explosion/spaCy.git, synced 2025-07-01 10:23:07 +03:00
Update docs [ci skip]
parent f5bcc10268
commit 74cb6d39d0
@@ -303,22 +303,23 @@ architectures into your training config.
 > stride = 96
 > ```
 
-Load and wrap a transformer model from the Huggingface transformers library.
-You can any transformer that has pretrained weights and a PyTorch
-implementation. The `name` variable is passed through to the underlying
-library, so it can be either a string or a path. If it's a string, the
-pretrained weights will be downloaded via the transformers library if they are
-not already available locally.
-
-In order to support longer documents, the `TransformerModel` layer allows you
-to pass in a `get_spans` function that will divide up the `Doc` objects before
-passing them through the transformer. Your spans are allowed to overlap or
-exclude tokens.
-
-This layer is usually used directly by the `Transformer` component, which
-allows you to share the transformer weights across your pipeline. For a layer
-that's configured for use in other components, see `Tok2VecTransformer`.
-
+Load and wrap a transformer model from the
+[HuggingFace `transformers`](https://huggingface.co/transformers) library. You
+can use any transformer that has pretrained weights and a PyTorch implementation.
+The `name` variable is passed through to the underlying library, so it can be
+either a string or a path. If it's a string, the pretrained weights will be
+downloaded via the transformers library if they are not already available
+locally.
+In order to support longer documents, the
+[TransformerModel](/api/architectures#TransformerModel) layer allows you to pass
+in a `get_spans` function that will divide up the [`Doc`](/api/doc) objects
+before passing them through the transformer. Your spans are allowed to overlap
+or exclude tokens. This layer is usually used directly by the
+[`Transformer`](/api/transformer) component, which allows you to share the
+transformer weights across your pipeline. For a layer that's configured for use
+in other components, see
+[Tok2VecTransformer](/api/architectures#Tok2VecTransformer).
+
 | Name | Description |
 | ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 
92
website/docs/usage/architectures.md
Normal file
@@ -0,0 +1,92 @@
---
title: Layers and Model Architectures
teaser: Power spaCy components with custom neural networks
menu:
  - ['Type Signatures', 'type-sigs']
  - ['Defining Sublayers', 'sublayers']
  - ['PyTorch & TensorFlow', 'frameworks']
  - ['Trainable Components', 'components']
---

A **model architecture** is a function that wires up a
[Thinc `Model`](https://thinc.ai/docs/api-model) instance, which you can then
use in a component or as a layer of a larger network. You can use Thinc as a
thin wrapper around frameworks such as PyTorch, TensorFlow or MXNet, or you can
implement your logic in Thinc directly. spaCy's built-in components will never
construct their `Model` instances themselves, so you won't have to subclass the
component to change its model architecture. You can just **update the config**
so that it refers to a different registered function. Once the component has
been created, its model instance has already been assigned, so you cannot change
its model architecture. The architecture is like a recipe for the network, and
you can't change the recipe once the dish has already been prepared. You have to
make a new one.

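
The recipe analogy can be sketched in plain Python. This is a toy illustration
only and does not use spaCy's or Thinc's real registry: the `ARCHITECTURES`
dict, the `register` decorator, the `ToyComponent` class and the `toy.*`
architecture names are all hypothetical stand-ins. Only the `@architectures`
key mirrors the naming used in spaCy config files.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a function registry (not spaCy's real registry).
ARCHITECTURES: Dict[str, Callable] = {}


def register(name: str):
    """Decorator that stores an architecture function under a string name."""
    def decorator(func):
        ARCHITECTURES[name] = func
        return func
    return decorator


@register("toy.CountChars.v1")
def build_count_chars():
    # The architecture is the recipe: calling it wires up a model.
    return lambda texts: [len(text) for text in texts]


@register("toy.CountWords.v1")
def build_count_words():
    return lambda texts: [len(text.split()) for text in texts]


class ToyComponent:
    """Hypothetical component: it receives a model and never builds one."""

    def __init__(self, model):
        self.model = model

    def __call__(self, texts):
        return self.model(texts)


def make_component(config: dict) -> ToyComponent:
    # Look up the registered function named in the config, then call it
    # with the remaining settings to create the model instance.
    settings = dict(config["model"])
    build_model = ARCHITECTURES[settings.pop("@architectures")]
    return ToyComponent(build_model(**settings))


config = {"model": {"@architectures": "toy.CountChars.v1"}}
component = make_component(config)

# Switching architectures means editing the config and building a new
# component; the existing component keeps the model it was created with.
config["model"]["@architectures"] = "toy.CountWords.v1"
new_component = make_component(config)
```

In this sketch, `component` still counts characters after the config is edited,
while `new_component` counts words. The component class itself never changes.
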
## Type signatures {#type-sigs}

The Thinc `Model` class is a **generic type** that can specify its input and
output types. Python uses a square-bracket notation for this, so the type
~~Model[List, Dict]~~ says that each batch of inputs to the model will be a
list, and the outputs will be a dictionary. Both `typing.List` and `typing.Dict`
are also generics, allowing you to be more specific about the data. For
instance, you can write ~~Model[List[Doc], Dict[str, float]]~~ to specify that
the model expects a list of [`Doc`](/api/doc) objects as input, and returns a
dictionary mapping strings to floats. Some of the most common types you'll see
are:

| Type               | Description                                                                                          |
| ------------------ | ---------------------------------------------------------------------------------------------------- |
| ~~List[Doc]~~      | A batch of [`Doc`](/api/doc) objects. Most components expect their models to take this as input.     |
| ~~Floats2d~~       | A two-dimensional `numpy` or `cupy` array of floats. Usually 32-bit.                                 |
| ~~Ints2d~~         | A two-dimensional `numpy` or `cupy` array of integers. Common dtypes include uint64, int32 and int8. |
| ~~List[Floats2d]~~ | A list of two-dimensional arrays, generally with one array per `Doc` and one row per token.          |
| ~~Ragged~~         | A container to handle variable-length sequence data in an unpadded contiguous array.                 |
| ~~Padded~~         | A container to handle variable-length sequence data in a padded contiguous array.                    |

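
The square-bracket notation can be tried out with the standard `typing` module
alone. The sketch below assumes a hypothetical `ToyModel` class in place of
Thinc's `Model`, and uses `str` where a real signature would use `Doc`, so that
it runs without spaCy or Thinc installed.

```python
from typing import Dict, Generic, List, TypeVar, get_args

InT = TypeVar("InT")
OutT = TypeVar("OutT")


class ToyModel(Generic[InT, OutT]):
    """Hypothetical stand-in for a generic model class (not Thinc's Model)."""


# Subscripting a generic class records the input and output type arguments.
Loose = ToyModel[List, Dict]
Specific = ToyModel[List[str], Dict[str, float]]

# get_args() recovers the arguments, which can themselves be generics.
loose_in, loose_out = get_args(Loose)
specific_in, specific_out = get_args(Specific)
```

Here `Specific` says more than `Loose`: its input is a list of strings and its
output maps strings to floats, which is the same refinement the paragraph
describes for ~~Model[List[Doc], Dict[str, float]]~~.
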
The model type-signatures help you figure out which model architectures and
components can fit together. For instance, the
[`TextCategorizer`](/api/textcategorizer) class expects a model typed
~~Model[List[Doc], Floats2d]~~, because the model will predict one row of
category probabilities per `Doc`. In contrast, the `Tagger` class expects a
model typed ~~Model[List[Doc], List[Floats2d]]~~, because it needs to predict
one row of probabilities per token. There's no guarantee that two models with
the same type-signature can be used interchangeably. There are many other ways
they could be incompatible. However, if the types don't match, they almost
surely _won't_ be compatible. This little bit of validation goes a long way,
especially if you configure your editor or other tools to highlight these errors
early. Thinc will also verify that your types match correctly when your config
file is processed at the beginning of training.

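
As a rough illustration of why mismatched signatures are a useful warning sign,
the sketch below compares two signatures at runtime. It is not how Thinc or a
static type checker performs validation: `ToyModel`, `ToyDoc`, `ToyFloats2d`
and `same_signature` are hypothetical names introduced only for this example.

```python
from typing import Generic, List, TypeVar, get_args

InT = TypeVar("InT")
OutT = TypeVar("OutT")


class ToyModel(Generic[InT, OutT]):
    """Hypothetical generic model type, standing in for Thinc's Model."""


class ToyDoc:
    """Hypothetical placeholder for spaCy's Doc."""


class ToyFloats2d:
    """Hypothetical placeholder for a two-dimensional float array type."""


# One row of category probabilities per document.
TextcatModel = ToyModel[List[ToyDoc], ToyFloats2d]
# One array per document, with one row of probabilities per token.
TaggerModel = ToyModel[List[ToyDoc], List[ToyFloats2d]]


def same_signature(expected, actual) -> bool:
    """Compare the input and output type arguments of two signatures."""
    return get_args(expected) == get_args(actual)
```

Both signatures take the same input, but their outputs differ, so
`same_signature(TextcatModel, TaggerModel)` is false. A matching signature is
still only a necessary condition for compatibility, not a sufficient one.
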
## Defining sublayers {#sublayers}

Model architecture functions often accept sublayers as arguments, so that you
can try substituting a different layer into the network. Depending on how the
architecture function is structured, you might be able to define your network
structure entirely through the [config system](/usage/training#config), using
layers that have already been defined. The
[transformers documentation](/usage/embeddings-transformers#transformers)
section shows a common example of swapping in a different sublayer. In most NLP
neural network models, the most important parts of the network are what we refer
to as the
[embed and encode](https://explosion.ai/blog/embed-encode-attend-predict) steps.
These steps together compute dense, context-sensitive representations of the
tokens. Most of spaCy's default architectures accept a `tok2vec` layer as an
argument, so you can control this important part of the network separately. This
makes it easy to switch between transformer, CNN, BiLSTM or other feature
extraction approaches. And if you want to define your own solution, all you need
to do is register a ~~Model[List[Doc], List[Floats2d]]~~ architecture function,
and you'll be able to try it out in any of spaCy's components.

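
The idea of passing the embedding step in as an argument can be shown with
plain functions. This is a toy sketch, not spaCy's API: documents are modelled
as lists of token strings, vectors as lists of floats, and `length_tok2vec`,
`vowel_tok2vec` and `build_toy_tagger` are hypothetical names.

```python
from typing import Callable, List

# Hypothetical aliases: a toy "doc" is a list of token strings, and the
# tok2vec step returns one list of token vectors per doc.
ToyDoc = List[str]
ToyTok2Vec = Callable[[List[ToyDoc]], List[List[List[float]]]]


def length_tok2vec(docs: List[ToyDoc]) -> List[List[List[float]]]:
    """Toy embedding: a 1-dimensional vector holding each token's length."""
    return [[[float(len(token))] for token in doc] for doc in docs]


def vowel_tok2vec(docs: List[ToyDoc]) -> List[List[List[float]]]:
    """Alternative toy embedding: the number of vowels in each token."""
    return [
        [[float(sum(ch in "aeiou" for ch in token.lower()))] for token in doc]
        for doc in docs
    ]


def build_toy_tagger(tok2vec: ToyTok2Vec, threshold: float):
    """Architecture function that takes the embedding sublayer as an argument."""
    def predict(docs: List[ToyDoc]) -> List[List[str]]:
        vectors = tok2vec(docs)
        # Label each token by thresholding its single feature.
        return [
            ["HIGH" if vec[0] > threshold else "LOW" for vec in doc]
            for doc in vectors
        ]
    return predict


docs = [["a", "transformer", "model"]]
tagger_a = build_toy_tagger(length_tok2vec, threshold=4.0)
tagger_b = build_toy_tagger(vowel_tok2vec, threshold=2.0)
```

The tagger function is unchanged between `tagger_a` and `tagger_b`. Only the
sublayer differs, which is the kind of substitution a `tok2vec` argument makes
possible.
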
### Registering new architectures

- Recap concept, link to config docs.

## Wrapping PyTorch, TensorFlow and other frameworks {#frameworks}

- Explain concept
- Link off to notebook

## Models for trainable components {#components}

- Interaction with `predict`, `get_loss` and `set_annotations`
- Initialization life-cycle with `begin_training`.
- Link to relation extraction notebook.
@@ -24,6 +24,11 @@
     "tag": "new"
 },
 { "text": "Training Models", "url": "/usage/training", "tag": "new" },
+{
+    "text": "Layers & Model Architectures",
+    "url": "/usage/architectures",
+    "tag": "new"
+},
 { "text": "spaCy Projects", "url": "/usage/projects", "tag": "new" },
 { "text": "Saving & Loading", "url": "/usage/saving-loading" },
 { "text": "Visualizers", "url": "/usage/visualizers" }
@@ -29,6 +29,8 @@
 "Optimizer": "https://thinc.ai/docs/api-optimizers",
 "Model": "https://thinc.ai/docs/api-model",
 "Ragged": "https://thinc.ai/docs/api-types#ragged",
+"Padded": "https://thinc.ai/docs/api-types#padded",
+"Ints2d": "https://thinc.ai/docs/api-types#types",
 "Floats2d": "https://thinc.ai/docs/api-types#types",
 "Floats3d": "https://thinc.ai/docs/api-types#types",
 "FloatsXd": "https://thinc.ai/docs/api-types#types",
@@ -67,7 +67,7 @@
     border: 0
 
 // Special style for types in API tables
-td > &:last-child
+td:not(:first-child) > &:last-child
     display: block
     border-top: 1px dotted var(--color-subtle)
     border-radius: 0