Move TextCatCNN docs to legacy, in prep for moving to spacy-legacy

Daniël de Kok 2023-12-08 10:38:29 +01:00
parent 71aa6f4628
commit 9fb573e255
3 changed files with 50 additions and 44 deletions

@@ -55,6 +55,7 @@ redirects = [
{from = "/models/comparison", to = "/models", force = true},
{from = "/api/#section-cython", to = "/api/cython", force = true},
{from = "/api/#cython", to = "/api/cython", force = true},
{from = "/api/architectures#TextCatCNN", to = "/api/legacy#TextCatCNN_v2", force = true},
{from = "/api/sentencesegmenter", to="/api/sentencizer"},
{from = "/universe", to = "/universe/project/:id", query = {id = ":id"}, force = true},
{from = "/universe", to = "/universe/category/:category", query = {category = ":category"}, force = true},

@@ -1018,49 +1018,6 @@ but used an internal `tok2vec` instead of taking it as argument:
</Accordion>
### spacy.TextCatCNN.v2 {id="TextCatCNN"}
> #### Example Config
>
> ```ini
> [model]
> @architectures = "spacy.TextCatCNN.v2"
> exclusive_classes = false
> nO = null
>
> [model.tok2vec]
> @architectures = "spacy.HashEmbedCNN.v2"
> pretrained_vectors = null
> width = 96
> depth = 4
> embed_size = 2000
> window_size = 1
> maxout_pieces = 3
> subword_features = true
> ```
A neural network model where token vectors are calculated using a CNN. The
vectors are mean pooled and used as features in a feed-forward network. This
architecture is usually less accurate than the ensemble, but runs faster.
This model is identical to [TextCatReduce.v1](#TextCatReduce) with
`use_reduce_mean=true`, `use_reduce_first=false` and `use_reduce_max=false`.
| Name | Description |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `exclusive_classes` | Whether or not categories are mutually exclusive. ~~bool~~ |
| `tok2vec` | The [`tok2vec`](#tok2vec) layer of the model. ~~Model~~ |
| `nO` | Output dimension, determined by the number of different labels. If not set, the [`TextCategorizer`](/api/textcategorizer) component will set it when `initialize` is called. ~~Optional[int]~~ |
| **CREATES** | The model using the architecture. ~~Model[List[Doc], Floats2d]~~ |
<Accordion title="spacy.TextCatCNN.v1 definition" spaced>
[TextCatCNN.v1](/api/legacy#TextCatCNN_v1) had the exact same signature, but was
not yet resizable. Since v2, new labels can be added to this component, even
after training.
</Accordion>
### spacy.TextCatBOW.v3 {id="TextCatBOW"}
> #### Example Config

@@ -162,7 +162,10 @@ network has an internal CNN Tok2Vec layer and uses attention.
Since `spacy.TextCatCNN.v2`, this architecture has become resizable, which means
that you can add labels to a previously trained textcat. `TextCatCNN` v1 did not
yet support that.
yet support that. `TextCatCNN` has been replaced by the more general
[`TextCatReduce`](/api/architectures#TextCatReduce) layer. `TextCatCNN` is
identical to `TextCatReduce` with `use_reduce_mean=true`,
`use_reduce_first=false` and `use_reduce_max=false`.
> #### Example Config
>
@@ -194,6 +197,51 @@ architecture is usually less accurate than the ensemble, but runs faster.
| `nO` | Output dimension, determined by the number of different labels. If not set, the [`TextCategorizer`](/api/textcategorizer) component will set it when `initialize` is called. ~~Optional[int]~~ |
| **CREATES** | The model using the architecture. ~~Model[List[Doc], Floats2d]~~ |
### spacy.TextCatCNN.v2 {id="TextCatCNN_v2"}
> #### Example Config
>
> ```ini
> [model]
> @architectures = "spacy.TextCatCNN.v2"
> exclusive_classes = false
> nO = null
>
> [model.tok2vec]
> @architectures = "spacy.HashEmbedCNN.v2"
> pretrained_vectors = null
> width = 96
> depth = 4
> embed_size = 2000
> window_size = 1
> maxout_pieces = 3
> subword_features = true
> ```
A neural network model where token vectors are calculated using a CNN. The
vectors are mean pooled and used as features in a feed-forward network. This
architecture is usually less accurate than the ensemble, but runs faster.
`TextCatCNN` has been replaced by the more general
[`TextCatReduce`](/api/architectures#TextCatReduce) layer. `TextCatCNN` is
identical to `TextCatReduce` with `use_reduce_mean=true`,
`use_reduce_first=false` and `use_reduce_max=false`.
| Name | Description |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `exclusive_classes` | Whether or not categories are mutually exclusive. ~~bool~~ |
| `tok2vec` | The [`tok2vec`](#tok2vec) layer of the model. ~~Model~~ |
| `nO` | Output dimension, determined by the number of different labels. If not set, the [`TextCategorizer`](/api/textcategorizer) component will set it when `initialize` is called. ~~Optional[int]~~ |
| **CREATES** | The model using the architecture. ~~Model[List[Doc], Floats2d]~~ |
<Accordion title="spacy.TextCatCNN.v1 definition" spaced>
[TextCatCNN.v1](/api/legacy#TextCatCNN_v1) had the exact same signature, but was
not yet resizable. Since v2, new labels can be added to this component, even
after training.
</Accordion>
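The equivalence with `TextCatReduce` described above can be written out as a
config. This is a minimal sketch rather than an official migration recipe: it
assumes the replacement architecture is registered as
`spacy.TextCatReduce.v1`, and the `use_reduce_last` setting is an assumption
that the text above does not mention, so check it against the
[`TextCatReduce`](/api/architectures#TextCatReduce) documentation for your
spaCy version. The `tok2vec` block is copied unchanged from the example config.

```ini
[model]
@architectures = "spacy.TextCatReduce.v1"
exclusive_classes = false
# Mean pooling only, which is what TextCatCNN does
use_reduce_first = false
use_reduce_max = false
use_reduce_mean = true
# Assumption: this setting is not named in the text above
use_reduce_last = false
nO = null

[model.tok2vec]
@architectures = "spacy.HashEmbedCNN.v2"
pretrained_vectors = null
width = 96
depth = 4
embed_size = 2000
window_size = 1
maxout_pieces = 3
subword_features = true
```

Only the `[model]` block changes. The embedding and encoding layers stay the
same, so the token vectors fed into the pooling step should be computed
exactly as before.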
### spacy.TextCatBOW.v1 {id="TextCatBOW_v1"}
Since `spacy.TextCatBOW.v2`, this architecture has become resizable, which means