Fix piece_encoder entries

This commit is contained in:
parent a282aec814
commit d8722877cb
@@ -484,9 +484,9 @@ The other arguments are shared between all versions.
 ## Curated transformer architectures {id="curated-trf",source="https://github.com/explosion/spacy-curated-transformers/blob/main/spacy_curated_transformers/models/architectures.py"}

 The following architectures are provided by the package
-[`spacy-curated-transformers`](https://github.com/explosion/spacy-curated-transformers). See the
-[usage documentation](/usage/embeddings-transformers#transformers) for how to
-integrate the architectures into your training config.
+[`spacy-curated-transformers`](https://github.com/explosion/spacy-curated-transformers).
+See the [usage documentation](/usage/embeddings-transformers#transformers) for
+how to integrate the architectures into your training config.

 <Infobox variant="warning">

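For orientation, a minimal sketch of adding one of these components from Python rather than a config file, assuming `spacy` and `spacy-curated-transformers` are installed and that `curated_transformer` is the component factory registered by that package:

```python
# Minimal sketch, not an official snippet: add a curated transformer component
# to a blank pipeline. Importing spacy_curated_transformers registers the
# "curated_transformer" factory and the architectures documented below.
import spacy
import spacy_curated_transformers  # noqa: F401

nlp = spacy.blank("en")
nlp.add_pipe("curated_transformer")  # the model architecture is set via its config
print(nlp.pipe_names)  # ['curated_transformer']
```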
@@ -503,11 +503,10 @@ for details and system requirements.
 Construct an ALBERT transformer model.

 | Name | Description |
-|--------------------------------|-----------------------------------------------------------------------------|
+| ------------------------------ | --------------------------------------------------------------------------- |
 | `vocab_size` | Vocabulary size. ~~int~~ |
 | `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
-| `with_spans` | piece_encoder (Model) ~~Callable~~ |
-| `with_spans` | The piece encoder to segment input tokens. ~~Callable~~ |
+| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
 | `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~ |
 | `embedding_width` | Width of the embedding representations. ~~int~~ |
 | `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
@@ -533,11 +532,10 @@ Construct an ALBERT transformer model.
 Construct a BERT transformer model.

 | Name | Description |
-|--------------------------------|-----------------------------------------------------------------------------|
+| ------------------------------ | --------------------------------------------------------------------------- |
 | `vocab_size` | Vocabulary size. ~~int~~ |
 | `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
-| `with_spans` | piece_encoder (Model) ~~Callable~~ |
-| `with_spans` | The piece encoder to segment input tokens. ~~Callable~~ |
+| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
 | `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~ |
 | `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
 | `hidden_dropout_prob` | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
@@ -561,11 +559,10 @@ Construct a BERT transformer model.
 Construct a CamemBERT transformer model.

 | Name | Description |
-|--------------------------------|-----------------------------------------------------------------------------|
+| ------------------------------ | --------------------------------------------------------------------------- |
 | `vocab_size` | Vocabulary size. ~~int~~ |
 | `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
-| `with_spans` | piece_encoder (Model) ~~Callable~~ |
-| `with_spans` | The piece encoder to segment input tokens. ~~Callable~~ |
+| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
 | `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~ |
 | `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
 | `hidden_dropout_prob` | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
@@ -589,11 +586,10 @@ Construct a CamemBERT transformer model.
 Construct a RoBERTa transformer model.

 | Name | Description |
-|--------------------------------|-----------------------------------------------------------------------------|
+| ------------------------------ | --------------------------------------------------------------------------- |
 | `vocab_size` | Vocabulary size. ~~int~~ |
 | `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
-| `with_spans` | piece_encoder (Model) ~~Callable~~ |
-| `with_spans` | The piece encoder to segment input tokens. ~~Callable~~ |
+| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
 | `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~ |
 | `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
 | `hidden_dropout_prob` | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
@@ -612,17 +608,15 @@ Construct a RoBERTa transformer model.
 | `grad_scaler_config` | Configuration passed to the PyTorch gradient scaler. ~~dict~~ |
 | **CREATES** | The model using the architecture ~~Model[TransformerInT, TransformerOutT]~~ |


 ### spacy-curated-transformers.XlmrTransformer.v1

 Construct an XLM-RoBERTa transformer model.

 | Name | Description |
-|--------------------------------|-----------------------------------------------------------------------------|
+| ------------------------------ | --------------------------------------------------------------------------- |
 | `vocab_size` | Vocabulary size. ~~int~~ |
 | `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
-| `with_spans` | piece_encoder (Model) ~~Callable~~ |
-| `with_spans` | The piece encoder to segment input tokens. ~~Callable~~ |
+| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
 | `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~ |
 | `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
 | `hidden_dropout_prob` | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
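Across all of the transformer architectures above, the `piece_encoder` and `with_spans` entries nest under the model block in the same way. A minimal sketch of that nesting, parsed here with Thinc's `Config` for illustration; the hyperparameter values and the `WithStridedSpans.v1` span generator are assumptions, not requirements from this page:

```python
# Sketch of how `piece_encoder` and `with_spans` nest under the transformer
# model block of a training config. The config is only parsed, not resolved,
# so the hyperparameters shown are illustrative.
from thinc.api import Config

cfg = Config().from_str(
    """
[components.transformer]
factory = "curated_transformer"

[components.transformer.model]
@architectures = "spacy-curated-transformers.XlmrTransformer.v1"
vocab_size = 250002
hidden_dropout_prob = 0.1

[components.transformer.model.piece_encoder]
@architectures = "spacy-curated-transformers.XlmrSentencepieceEncoder.v1"

[components.transformer.model.with_spans]
@architectures = "spacy-curated-transformers.WithStridedSpans.v1"
"""
)
print(cfg["components"]["transformer"]["model"]["piece_encoder"])
```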
@@ -641,13 +635,13 @@ Construct an XLM-RoBERTa transformer model.
 | `grad_scaler_config` | Configuration passed to the PyTorch gradient scaler. ~~dict~~ |
 | **CREATES** | The model using the architecture ~~Model[TransformerInT, TransformerOutT]~~ |


 ### spacy-curated-transformers.ScalarWeight.v1

-Construct a model that accepts a list of transformer layer outputs and returns a weighted representation of the same.
+Construct a model that accepts a list of transformer layer outputs and returns a
+weighted representation of the same.

 | Name | Description |
-|----------------------|-------------------------------------------------------------------------------|
+| -------------------- | ----------------------------------------------------------------------------- |
 | `num_layers` | Number of transformer hidden layers. ~~int~~ |
 | `dropout_prob` | Dropout probability. ~~float~~ |
 | `mixed_precision` | Use mixed-precision training. ~~bool~~ |
@@ -656,137 +650,130 @@ Construct a model that accepts a list of transformer layer outputs and returns a

 ### spacy-curated-transformers.TransformerLayersListener.v1

-Construct a listener layer that communicates with one or more upstream Transformer
-components. This layer extracts the output of the last transformer layer and performs
-pooling over the individual pieces of each Doc token, returning their corresponding
-representations. The upstream name should either be the wildcard string '*', or the name of the Transformer component.
+Construct a listener layer that communicates with one or more upstream
+Transformer components. This layer extracts the output of the last transformer
+layer and performs pooling over the individual pieces of each Doc token,
+returning their corresponding representations. The upstream name should either
+be the wildcard string '\*', or the name of the Transformer component.

 In almost all cases, the wildcard string will suffice as there'll only be one
-upstream Transformer component. But in certain situations, e.g: you have disjoint
-datasets for certain tasks, or you'd like to use a pre-trained pipeline but a
-downstream task requires its own token representations, you could end up with
-more than one Transformer component in the pipeline.
+upstream Transformer component. But in certain situations, e.g. you have
+disjoint datasets for certain tasks, or you'd like to use a pre-trained pipeline
+but a downstream task requires its own token representations, you could end up
+with more than one Transformer component in the pipeline.


 | Name | Description |
-|-----------------|------------------------------------------------------------------------------------------------------------------------|
+| --------------- | ---------------------------------------------------------------------------------------------------------------------- |
 | `layers` | The number of layers produced by the upstream transformer component, excluding the embedding layer. ~~int~~ |
 | `width` | The width of the vectors produced by the upstream transformer component. ~~int~~ |
 | `pooling` | Model that is used to perform pooling over the piece representations. ~~Model~~ |
 | `upstream_name` | A string to identify the 'upstream' Transformer component to communicate with. ~~str~~ |
 | `grad_factor` | Factor to multiply gradients with. ~~float~~ |
 | **CREATES** | A model that returns the relevant vectors from an upstream transformer component. ~~Model[List[Doc], List[Floats2d]]~~ |


 ### spacy-curated-transformers.LastTransformerLayerListener.v1

-Construct a listener layer that communicates with one or more upstream Transformer
-components. This layer extracts the output of the last transformer layer and performs
-pooling over the individual pieces of each Doc token, returning their corresponding
-representations. The upstream name should either be the wildcard string '*', or the name of the Transformer component.
+Construct a listener layer that communicates with one or more upstream
+Transformer components. This layer extracts the output of the last transformer
+layer and performs pooling over the individual pieces of each Doc token,
+returning their corresponding representations. The upstream name should either
+be the wildcard string '\*', or the name of the Transformer component.

 In almost all cases, the wildcard string will suffice as there'll only be one
-upstream Transformer component. But in certain situations, e.g: you have disjoint
-datasets for certain tasks, or you'd like to use a pre-trained pipeline but a
-downstream task requires its own token representations, you could end up with
-more than one Transformer component in the pipeline.
+upstream Transformer component. But in certain situations, e.g. you have
+disjoint datasets for certain tasks, or you'd like to use a pre-trained pipeline
+but a downstream task requires its own token representations, you could end up
+with more than one Transformer component in the pipeline.

 | Name | Description |
-|-----------------|------------------------------------------------------------------------------------------------------------------------|
+| --------------- | ---------------------------------------------------------------------------------------------------------------------- |
 | `width` | The width of the vectors produced by the upstream transformer component. ~~int~~ |
 | `pooling` | Model that is used to perform pooling over the piece representations. ~~Model~~ |
 | `upstream_name` | A string to identify the 'upstream' Transformer component to communicate with. ~~str~~ |
 | `grad_factor` | Factor to multiply gradients with. ~~float~~ |
 | **CREATES** | A model that returns the relevant vectors from an upstream transformer component. ~~Model[List[Doc], List[Floats2d]]~~ |

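A downstream component consumes these pooled representations by using a listener as its `tok2vec` sublayer. A minimal sketch for a tagger wired to `LastTransformerLayerListener.v1`, parsed here with Thinc's `Config`; the tagger architecture, pooling layer, and width are illustrative assumptions:

```python
# Sketch: a tagger whose tok2vec sublayer listens to the (single) upstream
# transformer via the wildcard upstream name. The config is only parsed here,
# not resolved into a model.
from thinc.api import Config

cfg = Config().from_str(
    """
[components.tagger]
factory = "tagger"

[components.tagger.model]
@architectures = "spacy.Tagger.v2"

[components.tagger.model.tok2vec]
@architectures = "spacy-curated-transformers.LastTransformerLayerListener.v1"
width = 768
upstream_name = "*"
grad_factor = 1.0

[components.tagger.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
"""
)
print(cfg["components"]["tagger"]["model"]["tok2vec"]["upstream_name"])  # "*"
```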
 ### spacy-curated-transformers.ScalarWeightingListener.v1

-Construct a listener layer that communicates with one or more upstream Transformer
-components. This layer calculates a weighted representation of all transformer layer
-outputs and performs pooling over the individual pieces of each Doc token, returning
-their corresponding representations.
+Construct a listener layer that communicates with one or more upstream
+Transformer components. This layer calculates a weighted representation of all
+transformer layer outputs and performs pooling over the individual pieces of
+each Doc token, returning their corresponding representations.

 Requires its upstream Transformer components to return all layer outputs from
-their models. The upstream name should either be the wildcard string '*', or the name of the Transformer component.
+their models. The upstream name should either be the wildcard string '\*', or
+the name of the Transformer component.

 In almost all cases, the wildcard string will suffice as there'll only be one
-upstream Transformer component. But in certain situations, e.g: you have disjoint
-datasets for certain tasks, or you'd like to use a pre-trained pipeline but a
-downstream task requires its own token representations, you could end up with
-more than one Transformer component in the pipeline.
+upstream Transformer component. But in certain situations, e.g. you have
+disjoint datasets for certain tasks, or you'd like to use a pre-trained pipeline
+but a downstream task requires its own token representations, you could end up
+with more than one Transformer component in the pipeline.

 | Name | Description |
-|-----------------|------------------------------------------------------------------------------------------------------------------------|
+| --------------- | ---------------------------------------------------------------------------------------------------------------------- |
 | `width` | The width of the vectors produced by the upstream transformer component. ~~int~~ |
 | `weighting` | Model that is used to perform the weighting of the different layer outputs. ~~Model~~ |
 | `pooling` | Model that is used to perform pooling over the piece representations. ~~Model~~ |
 | `upstream_name` | A string to identify the 'upstream' Transformer component to communicate with. ~~str~~ |
 | `grad_factor` | Factor to multiply gradients with. ~~float~~ |
 | **CREATES** | A model that returns the relevant vectors from an upstream transformer component. ~~Model[List[Doc], List[Floats2d]]~~ |

 ### spacy-curated-transformers.BertWordpieceEncoder.v1

-Construct a WordPiece piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers. This encoder also splits each token
-on punctuation characters, as expected by most BERT models.
+Construct a WordPiece piece encoder model that accepts a list of token sequences
+or documents and returns a corresponding list of piece identifiers. This encoder
+also splits each token on punctuation characters, as expected by most BERT
+models.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

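In practice, "separately initialized using an appropriate loader" means pointing the encoder at a vocabulary in the `[initialize]` block of the training config. A minimal sketch, assuming the package's Hugging Face piece-encoder loader and an illustrative checkpoint name; check the package documentation for the exact loader entries and section keys:

```python
# Sketch only: initializing a piece encoder from a Hugging Face checkpoint via
# the [initialize] block. The "piecer_loader" key, the "model_loaders" registry,
# and the HFPieceEncoderLoader.v1 name are assumptions, not taken from this page.
from thinc.api import Config

init_cfg = Config().from_str(
    """
[initialize.components.transformer.piecer_loader]
@model_loaders = "spacy-curated-transformers.HFPieceEncoderLoader.v1"
name = "bert-base-uncased"
"""
)
print(init_cfg["initialize"]["components"]["transformer"]["piecer_loader"]["name"])
```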
 ### spacy-curated-transformers.ByteBpeEncoder.v1

-Construct a Byte-BPE piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers.
+Construct a Byte-BPE piece encoder model that accepts a list of token sequences
+or documents and returns a corresponding list of piece identifiers.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

 ### spacy-curated-transformers.CamembertSentencepieceEncoder.v1

-Construct a SentencePiece piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers with CamemBERT post-processing applied.
+Construct a SentencePiece piece encoder model that accepts a list of token
+sequences or documents and returns a corresponding list of piece identifiers
+with CamemBERT post-processing applied.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

 ### spacy-curated-transformers.CharEncoder.v1

-Construct a character piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers.
+Construct a character piece encoder model that accepts a list of token sequences
+or documents and returns a corresponding list of piece identifiers.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

 ### spacy-curated-transformers.SentencepieceEncoder.v1

-Construct a SentencePiece piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers with CamemBERT post-processing applied.
+Construct a SentencePiece piece encoder model that accepts a list of token
+sequences or documents and returns a corresponding list of piece identifiers
+with CamemBERT post-processing applied.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

 ### spacy-curated-transformers.WordpieceEncoder.v1

-Construct a WordPiece piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers. This encoder also splits each token
-on punctuation characters, as expected by most BERT models.
+Construct a WordPiece piece encoder model that accepts a list of token sequences
+or documents and returns a corresponding list of piece identifiers. This encoder
+also splits each token on punctuation characters, as expected by most BERT
+models.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

 ### spacy-curated-transformers.XlmrSentencepieceEncoder.v1

-Construct a SentencePiece piece encoder model that accepts a list
-of token sequences or documents and returns a corresponding list
-of piece identifiers with XLM-RoBERTa post-processing applied.
+Construct a SentencePiece piece encoder model that accepts a list of token
+sequences or documents and returns a corresponding list of piece identifiers
+with XLM-RoBERTa post-processing applied.

-This model must be separately initialized using an appropriate
-loader.
+This model must be separately initialized using an appropriate loader.

 ## Pretraining architectures {id="pretrain",source="spacy/ml/models/multi_task.py"}

@@ -826,7 +813,7 @@ objective for a Tok2Vec layer. To use this objective, make sure that the
 vectors.

 | Name | Description |
-|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|
+| --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `maxout_pieces` | The number of maxout pieces to use. Recommended values are `2` or `3`. ~~int~~ |
 | `hidden_size` | Size of the hidden layer of the model. ~~int~~ |
 | `loss` | The loss function can be either "cosine" or "L2". We typically recommend using "cosine". ~~str~~ |
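These parameters correspond to spaCy's vector-prediction pretraining objective. A minimal sketch of the matching `[pretraining]` entry, assuming the registered architecture name `spacy.PretrainVectors.v1` (which is not stated on this page) and illustrative values:

```python
# Sketch: configuring the vector-prediction objective for pretraining.
# Parsed only; the objective's registered name is an assumption.
from thinc.api import Config

cfg = Config().from_str(
    """
[pretraining.objective]
@architectures = "spacy.PretrainVectors.v1"
maxout_pieces = 3
hidden_size = 300
loss = "cosine"
"""
)
print(cfg["pretraining"]["objective"]["loss"])  # "cosine"
```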