diff --git a/spacy/pipeline/span_predictor.py b/spacy/pipeline/span_predictor.py
index d7e96a4b2..99a1f7ef6 100644
--- a/spacy/pipeline/span_predictor.py
+++ b/spacy/pipeline/span_predictor.py
@@ -29,7 +29,7 @@ distance_embedding_size = 64
 conv_channels = 4
 window_size = 1
 max_distance = 128
-prefix = coref_head_clusters
+prefix = "coref_head_clusters"
 
 [model.tok2vec]
 @architectures = "spacy.Tok2Vec.v2"
diff --git a/website/docs/api/architectures.md b/website/docs/api/architectures.md
index 4e70eee87..e881864a9 100644
--- a/website/docs/api/architectures.md
+++ b/website/docs/api/architectures.md
@@ -587,8 +587,8 @@ consists of either two or three subnetworks:
   run once for each batch.
 - **lower**: Construct a feature-specific vector for each `(token, feature)`
   pair. This is also run once for each batch. Constructing the state
-  representation is then a matter of summing the component features and
-  applying the non-linearity.
+  representation is then a matter of summing the component features and applying
+  the non-linearity.
 - **upper** (optional): A feed-forward network that predicts scores from the
   state representation. If not present, the output from the lower model is used
   as action scores directly.
@@ -628,8 +628,8 @@ same signature, but the `use_upper` argument was `True` by default.
 > ```
 
 Build a tagger model, using a provided token-to-vector component. The tagger
-model adds a linear layer with softmax activation to predict scores given
-the token vectors.
+model adds a linear layer with softmax activation to predict scores given the
+token vectors.
 
 | Name | Description |
 | ----------- | ------------------------------------------------------------------------------------------ |
@@ -920,8 +920,8 @@ A function that reads an existing `KnowledgeBase` from file.
 A function that takes as input a [`KnowledgeBase`](/api/kb) and a
 [`Span`](/api/span) object denoting a named entity, and returns a list of
 plausible [`Candidate`](/api/kb/#candidate) objects. The default
-`CandidateGenerator` uses the text of a mention to find its potential
-aliases in the `KnowledgeBase`. Note that this function is case-dependent.
+`CandidateGenerator` uses the text of a mention to find its potential aliases in
+the `KnowledgeBase`. Note that this function is case-dependent.
 
 ## Coreference Architectures
 
@@ -975,7 +975,11 @@ The `Coref` model architecture is a Thinc `Model`.
 > [model]
 > @architectures = "spacy.SpanPredictor.v1"
 > hidden_size = 1024
-> dist_emb_size = 64
+> distance_embedding_size = 64
+> conv_channels = 4
+> window_size = 1
+> max_distance = 128
+> prefix = "coref_head_clusters"
 >
 > [model.tok2vec]
 > @architectures = "spacy-transformers.TransformerListener.v1"
@@ -986,13 +990,14 @@ The `Coref` model architecture is a Thinc `Model`.
 
 The `SpanPredictor` model architecture is a Thinc `Model`.
 
-| Name | Description |
-| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `tok2vec` | The [`tok2vec`](#tok2vec) layer of the model. ~~Model~~ |
-| `distance_embedding_size` | A representation of the distance between two candidates. ~~int~~ |
-| `dropout` | The dropout to use internally. Unlike some Thinc models, this has separate dropout for the internal PyTorch layers. ~~float~~ |
-| `hidden_size` | Size of the main internal layers. ~~int~~ |
-| `depth` | Depth of the internal network. ~~int~~ |
-| `antecedent_limit` | How many candidate antecedents to keep after rough scoring. This has a significant effect on memory usage. Typical values would be 50 to 200, or higher for very long documents. ~~int~~ |
-| `antecedent_batch_size` | Internal batch size. ~~int~~ |
-| **CREATES** | The model using the architecture. ~~Model[List[Doc], TupleFloats2d]~~ |
+| Name | Description |
+| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------- |
+| `tok2vec` | The [`tok2vec`](#tok2vec) layer of the model. ~~Model~~ |
+| `distance_embedding_size` | A representation of the distance between two candidates. ~~int~~ |
+| `dropout` | The dropout to use internally. Unlike some Thinc models, this has separate dropout for the internal PyTorch layers. ~~float~~ |
+| `hidden_size` | Size of the main internal layers. ~~int~~ |
+| `conv_channels` | The number of channels in the internal CNN. ~~int~~ |
+| `window_size` | The number of neighboring tokens to consider in the internal CNN. `1` means consider one token on each side. ~~int~~ |
+| `max_distance` | The longest possible length of a predicted span. ~~int~~ |
+| `prefix` | The prefix that indicates spans to use for input data. ~~string~~ |
+| **CREATES** | The model using the architecture. ~~Model[List[Doc], TupleFloats2d]~~ |
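
Review note: the quoting change to `prefix` can be illustrated with a small standard-library sketch. It assumes that config values are written as JSON-style literals, and it does not use spaCy's or Thinc's actual config loader. The helper names `read_values` and `is_valid_json` are hypothetical and exist only for this illustration; the config text is the example block from the updated docs in this diff.

```python
import configparser
import json

# Example config block copied from the updated docs in this diff.
CONFIG_TEXT = """
[model]
@architectures = "spacy.SpanPredictor.v1"
hidden_size = 1024
distance_embedding_size = 64
conv_channels = 4
window_size = 1
max_distance = 128
prefix = "coref_head_clusters"

[model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
"""


def read_values(text):
    """Parse INI-style sections and decode every value as a JSON literal.

    Hypothetical helper: an approximation for illustration, not spaCy's loader.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys such as "@architectures" unchanged
    parser.read_string(text)
    return {
        section: {key: json.loads(raw) for key, raw in parser[section].items()}
        for section in parser.sections()
    }


def is_valid_json(raw):
    """Return True if `raw` decodes as a JSON literal (hypothetical helper)."""
    try:
        json.loads(raw)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    values = read_values(CONFIG_TEXT)
    print(values["model"]["prefix"])              # decoded as a Python str
    print(is_valid_json('"coref_head_clusters"'))  # quoted form
    print(is_valid_json("coref_head_clusters"))    # unquoted form from the old default
```

Under that assumption, the quoted value decodes to a plain string, while the unquoted form from the old default config is not a valid JSON literal, which is presumably why the patch adds the quotes.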