Update docs

Parameter names in architecture docs were not updated after parameters
were renamed.
Paul O'Leary McCann 2022-07-06 17:13:31 +09:00
parent c59aeeb0ae
commit da9c379355
2 changed files with 23 additions and 18 deletions


@@ -29,7 +29,7 @@ distance_embedding_size = 64
 conv_channels = 4
 window_size = 1
 max_distance = 128
-prefix = coref_head_clusters
+prefix = "coref_head_clusters"
 [model.tok2vec]
 @architectures = "spacy.Tok2Vec.v2"


@@ -587,8 +587,8 @@ consists of either two or three subnetworks:
 run once for each batch.
 - **lower**: Construct a feature-specific vector for each `(token, feature)`
 pair. This is also run once for each batch. Constructing the state
-representation is then a matter of summing the component features and
-applying the non-linearity.
+representation is then a matter of summing the component features and applying
+the non-linearity.
 - **upper** (optional): A feed-forward network that predicts scores from the
 state representation. If not present, the output from the lower model is used
 as action scores directly.
@@ -628,8 +628,8 @@ same signature, but the `use_upper` argument was `True` by default.
 > ```
 Build a tagger model, using a provided token-to-vector component. The tagger
-model adds a linear layer with softmax activation to predict scores given
-the token vectors.
+model adds a linear layer with softmax activation to predict scores given the
+token vectors.
 | Name | Description |
 | ----------- | ------------------------------------------------------------------------------------------ |
@@ -920,8 +920,8 @@ A function that reads an existing `KnowledgeBase` from file.
 A function that takes as input a [`KnowledgeBase`](/api/kb) and a
 [`Span`](/api/span) object denoting a named entity, and returns a list of
 plausible [`Candidate`](/api/kb/#candidate) objects. The default
-`CandidateGenerator` uses the text of a mention to find its potential
-aliases in the `KnowledgeBase`. Note that this function is case-dependent.
+`CandidateGenerator` uses the text of a mention to find its potential aliases in
+the `KnowledgeBase`. Note that this function is case-dependent.
 ## Coreference Architectures
@@ -975,7 +975,11 @@ The `Coref` model architecture is a Thinc `Model`.
 > [model]
 > @architectures = "spacy.SpanPredictor.v1"
 > hidden_size = 1024
-> dist_emb_size = 64
+> distance_embedding_size = 64
+> conv_channels = 4
+> window_size = 1
+> max_distance = 128
+> prefix = "coref_head_clusters"
 >
 > [model.tok2vec]
 > @architectures = "spacy-transformers.TransformerListener.v1"
@@ -986,13 +990,14 @@ The `Coref` model architecture is a Thinc `Model`.
 The `SpanPredictor` model architecture is a Thinc `Model`.
 | Name | Description |
-| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------- |
 | `tok2vec` | The [`tok2vec`](#tok2vec) layer of the model. ~~Model~~ |
 | `distance_embedding_size` | A representation of the distance between two candidates. ~~int~~ |
 | `dropout` | The dropout to use internally. Unlike some Thinc models, this has separate dropout for the internal PyTorch layers. ~~float~~ |
 | `hidden_size` | Size of the main internal layers. ~~int~~ |
-| `depth` | Depth of the internal network. ~~int~~ |
-| `antecedent_limit` | How many candidate antecedents to keep after rough scoring. This has a significant effect on memory usage. Typical values would be 50 to 200, or higher for very long documents. ~~int~~ |
-| `antecedent_batch_size` | Internal batch size. ~~int~~ |
+| `conv_channels` | The number of channels in the internal CNN. ~~int~~ |
+| `window_size` | The number of neighboring tokens to consider in the internal CNN. `1` means consider one token on each side. ~~int~~ |
+| `max_distance` | The longest possible length of a predicted span. ~~int~~ |
+| `prefix` | The prefix that indicates spans to use for input data. ~~string~~ |
 | **CREATES** | The model using the architecture. ~~Model[List[Doc], TupleFloats2d]~~ |
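
One way to sanity-check the renamed parameters is to parse the new-side example config and compare its keys with the new-side table. The sketch below is not how spaCy loads configs: it uses only Python's standard library (`configparser` plus `json`) as a stand-in, and `CONFIG_TEXT`, `DOCUMENTED`, and `parse_model_block` are hypothetical names introduced for illustration. The `[model.tok2vec]` sub-block is left out because the diff shows only part of it.

```python
import configparser
import json

# New-side example config from the diff, with the "> " quote markers removed.
# The [model.tok2vec] sub-block is omitted (only partly visible in the diff).
CONFIG_TEXT = """
[model]
@architectures = "spacy.SpanPredictor.v1"
hidden_size = 1024
distance_embedding_size = 64
conv_channels = 4
window_size = 1
max_distance = 128
prefix = "coref_head_clusters"
"""

# Parameter names listed in the new-side SpanPredictor table.
DOCUMENTED = {
    "tok2vec",
    "distance_embedding_size",
    "dropout",
    "hidden_size",
    "conv_channels",
    "window_size",
    "max_distance",
    "prefix",
}


def parse_model_block(text):
    """Parse the [model] block, reading each value as JSON where possible.

    This is an assumption-laden approximation for illustration, not spaCy's loader.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys exactly as written
    parser.read_string(text)
    values = {}
    for key, raw in parser["model"].items():
        try:
            values[key] = json.loads(raw)
        except json.JSONDecodeError:
            values[key] = raw  # fall back to the raw string
    return values


values = parse_model_block(CONFIG_TEXT)
# "@architectures" selects the architecture and is not a table parameter.
unknown = set(values) - DOCUMENTED - {"@architectures"}
print("keys missing from the table:", sorted(unknown))
```

With the old-side name `dist_emb_size` in the config, the same check would report it as missing from the table, which is the mismatch this commit addresses.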