# Fix piece_encoder entries

Commit d8722877cb (parent a282aec814)

## Curated transformer architectures {id="curated-trf",source="https://github.com/explosion/spacy-curated-transformers/blob/main/spacy_curated_transformers/models/architectures.py"}

The following architectures are provided by the package
[`spacy-curated-transformers`](https://github.com/explosion/spacy-curated-transformers).
See the [usage documentation](/usage/embeddings-transformers#transformers) for
how to integrate the architectures into your training config.

<Infobox variant="warning">

… for details and system requirements.

</Infobox>

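For example, a transformer component using one of these architectures could be
wired into the config roughly as follows. This is a minimal sketch: the
`curated_transformer` factory name and the `WithStridedSpans.v1` span generator
are assumptions based on the package's conventions rather than taken from this
page, and the `vocab_size` value is only illustrative.

```ini
# Sketch: a curated transformer component in config.cfg
[components.transformer]
factory = "curated_transformer"

[components.transformer.model]
@architectures = "spacy-curated-transformers.XlmrTransformer.v1"
vocab_size = 250002

# Span generator that chunks long documents for the transformer (assumed name)
[components.transformer.model.with_spans]
@architectures = "spacy-curated-transformers.WithStridedSpans.v1"

# Piece encoder documented later in this section
[components.transformer.model.piece_encoder]
@architectures = "spacy-curated-transformers.XlmrSentencepieceEncoder.v1"
```
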
### spacy-curated-transformers.AlbertTransformer.v1

Construct an ALBERT transformer model.

| Name                           | Description                                                    |
| ------------------------------ | -------------------------------------------------------------- |
| `vocab_size`                   | Vocabulary size. ~~int~~                                       |
| `with_spans`                   | Callback that constructs a span generator model. ~~Callable~~  |
| `piece_encoder`                | The piece encoder to segment input tokens. ~~Model~~           |
| `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~    |
| `embedding_width`              | Width of the embedding representations. ~~int~~                |
| `hidden_act`                   | Activation used by the point-wise feed-forward layers. ~~str~~ |
| …                              | …                                                              |

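As a sketch, a hypothetical ALBERT model block using the parameters listed
above; ALBERT factorizes its embeddings, so `embedding_width` can be much
smaller than the transformer's hidden width. The values are illustrative, and
`WithStridedSpans.v1` is again an assumed name:

```ini
[components.transformer.model]
@architectures = "spacy-curated-transformers.AlbertTransformer.v1"
vocab_size = 30000
embedding_width = 128

[components.transformer.model.with_spans]
@architectures = "spacy-curated-transformers.WithStridedSpans.v1"

[components.transformer.model.piece_encoder]
@architectures = "spacy-curated-transformers.SentencepieceEncoder.v1"
```
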
### spacy-curated-transformers.BertTransformer.v1

Construct a BERT transformer model.

| Name                           | Description                                                                        |
| ------------------------------ | ---------------------------------------------------------------------------------- |
| `vocab_size`                   | Vocabulary size. ~~int~~                                                           |
| `with_spans`                   | Callback that constructs a span generator model. ~~Callable~~                      |
| `piece_encoder`                | The piece encoder to segment input tokens. ~~Model~~                               |
| `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~                        |
| `hidden_act`                   | Activation used by the point-wise feed-forward layers. ~~str~~                     |
| `hidden_dropout_prob`          | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
| …                              | …                                                                                  |

### spacy-curated-transformers.CamembertTransformer.v1

Construct a CamemBERT transformer model.

| Name                           | Description                                                                        |
| ------------------------------ | ---------------------------------------------------------------------------------- |
| `vocab_size`                   | Vocabulary size. ~~int~~                                                           |
| `with_spans`                   | Callback that constructs a span generator model. ~~Callable~~                      |
| `piece_encoder`                | The piece encoder to segment input tokens. ~~Model~~                               |
| `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~                        |
| `hidden_act`                   | Activation used by the point-wise feed-forward layers. ~~str~~                     |
| `hidden_dropout_prob`          | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
| …                              | …                                                                                  |

### spacy-curated-transformers.RobertaTransformer.v1

Construct a RoBERTa transformer model.

| Name                           | Description                                                                        |
| ------------------------------ | ---------------------------------------------------------------------------------- |
| `vocab_size`                   | Vocabulary size. ~~int~~                                                           |
| `with_spans`                   | Callback that constructs a span generator model. ~~Callable~~                      |
| `piece_encoder`                | The piece encoder to segment input tokens. ~~Model~~                               |
| `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~                        |
| `hidden_act`                   | Activation used by the point-wise feed-forward layers. ~~str~~                     |
| `hidden_dropout_prob`          | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
| …                              | …                                                                                  |
| `grad_scaler_config`           | Configuration passed to the PyTorch gradient scaler. ~~dict~~                      |
| **CREATES**                    | The model using the architecture. ~~Model[TransformerInT, TransformerOutT]~~       |

### spacy-curated-transformers.XlmrTransformer.v1

Construct an XLM-RoBERTa transformer model.

| Name                           | Description                                                                        |
| ------------------------------ | ---------------------------------------------------------------------------------- |
| `vocab_size`                   | Vocabulary size. ~~int~~                                                           |
| `with_spans`                   | Callback that constructs a span generator model. ~~Callable~~                      |
| `piece_encoder`                | The piece encoder to segment input tokens. ~~Model~~                               |
| `attention_probs_dropout_prob` | Dropout probability of the self-attention layers. ~~float~~                        |
| `hidden_act`                   | Activation used by the point-wise feed-forward layers. ~~str~~                     |
| `hidden_dropout_prob`          | Dropout probability of the point-wise feed-forward and embedding layers. ~~float~~ |
| …                              | …                                                                                  |
| `grad_scaler_config`           | Configuration passed to the PyTorch gradient scaler. ~~dict~~                      |
| **CREATES**                    | The model using the architecture. ~~Model[TransformerInT, TransformerOutT]~~       |

### spacy-curated-transformers.ScalarWeight.v1

Construct a model that accepts a list of transformer layer outputs and returns
a weighted representation of the same.

| Name              | Description                                  |
| ----------------- | -------------------------------------------- |
| `num_layers`      | Number of transformer hidden layers. ~~int~~ |
| `dropout_prob`    | Dropout probability. ~~float~~               |
| `mixed_precision` | Use mixed-precision training. ~~bool~~       |
| …                 | …                                            |

### spacy-curated-transformers.TransformerLayersListener.v1

Construct a listener layer that communicates with one or more upstream
Transformer components. This layer extracts the output of the last transformer
layer and performs pooling over the individual pieces of each Doc token,
returning their corresponding representations. The upstream name should either
be the wildcard string '\*', or the name of the Transformer component.

In almost all cases, the wildcard string will suffice as there'll only be one
upstream Transformer component. But in certain situations, e.g. if you have
disjoint datasets for certain tasks, or if you'd like to use a pre-trained
pipeline but a downstream task requires its own token representations, you
could end up with more than one Transformer component in the pipeline.

| Name            | Description                                                                                                             |
| --------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `layers`        | The number of layers produced by the upstream transformer component, excluding the embedding layer. ~~int~~              |
| `width`         | The width of the vectors produced by the upstream transformer component. ~~int~~                                         |
| `pooling`       | Model that is used to perform pooling over the piece representations. ~~Model~~                                          |
| `upstream_name` | A string to identify the 'upstream' Transformer component to communicate with. ~~str~~                                   |
| `grad_factor`   | Factor to multiply gradients with. ~~float~~                                                                             |
| **CREATES**     | A model that returns the relevant vectors from an upstream transformer component. ~~Model[List[Doc], List[Floats2d]]~~   |
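
A downstream component can then receive token representations through such a
listener. A minimal sketch, with illustrative width and layer count and the
`reduce_mean.v1` pooling layer from Thinc:

```ini
[components.tagger.model.tok2vec]
@architectures = "spacy-curated-transformers.TransformerLayersListener.v1"
layers = 12
width = 768
grad_factor = 1.0
# Use "*" to match any upstream transformer, or name a specific component
upstream_name = "*"

[components.tagger.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
```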

### spacy-curated-transformers.LastTransformerLayerListener.v1

Construct a listener layer that communicates with one or more upstream
Transformer components. This layer extracts the output of the last transformer
layer and performs pooling over the individual pieces of each Doc token,
returning their corresponding representations. The upstream name should either
be the wildcard string '\*', or the name of the Transformer component.

In almost all cases, the wildcard string will suffice as there'll only be one
upstream Transformer component. But in certain situations, e.g. if you have
disjoint datasets for certain tasks, or if you'd like to use a pre-trained
pipeline but a downstream task requires its own token representations, you
could end up with more than one Transformer component in the pipeline.

| Name            | Description                                                                                                             |
| --------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `width`         | The width of the vectors produced by the upstream transformer component. ~~int~~                                         |
| `pooling`       | Model that is used to perform pooling over the piece representations. ~~Model~~                                          |
| `upstream_name` | A string to identify the 'upstream' Transformer component to communicate with. ~~str~~                                   |
| `grad_factor`   | Factor to multiply gradients with. ~~float~~                                                                             |
| **CREATES**     | A model that returns the relevant vectors from an upstream transformer component. ~~Model[List[Doc], List[Floats2d]]~~   |

### spacy-curated-transformers.ScalarWeightingListener.v1

Construct a listener layer that communicates with one or more upstream
Transformer components. This layer calculates a weighted representation of all
transformer layer outputs and performs pooling over the individual pieces of
each Doc token, returning their corresponding representations.

Requires its upstream Transformer components to return all layer outputs from
their models. The upstream name should either be the wildcard string '\*', or
the name of the Transformer component.

In almost all cases, the wildcard string will suffice as there'll only be one
upstream Transformer component. But in certain situations, e.g. if you have
disjoint datasets for certain tasks, or if you'd like to use a pre-trained
pipeline but a downstream task requires its own token representations, you
could end up with more than one Transformer component in the pipeline.

| Name            | Description                                                                                                             |
| --------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `width`         | The width of the vectors produced by the upstream transformer component. ~~int~~                                         |
| `weighting`     | Model that is used to perform the weighting of the different layer outputs. ~~Model~~                                    |
| `pooling`       | Model that is used to perform pooling over the piece representations. ~~Model~~                                          |
| `upstream_name` | A string to identify the 'upstream' Transformer component to communicate with. ~~str~~                                   |
| `grad_factor`   | Factor to multiply gradients with. ~~float~~                                                                             |
| **CREATES**     | A model that returns the relevant vectors from an upstream transformer component. ~~Model[List[Doc], List[Floats2d]]~~   |
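
As a sketch, this listener can be combined with the `ScalarWeight.v1` model
documented above as its `weighting` argument. The width, layer count, and the
`reduce_mean.v1` pooling layer are illustrative assumptions:

```ini
[components.textcat.model.tok2vec]
@architectures = "spacy-curated-transformers.ScalarWeightingListener.v1"
width = 768
grad_factor = 1.0
upstream_name = "*"

# Learned scalar weights over all transformer layer outputs
[components.textcat.model.tok2vec.weighting]
@architectures = "spacy-curated-transformers.ScalarWeight.v1"
num_layers = 12
dropout_prob = 0.1
mixed_precision = false

[components.textcat.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
```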

### spacy-curated-transformers.BertWordpieceEncoder.v1

Construct a WordPiece piece encoder model that accepts a list of token
sequences or documents and returns a corresponding list of piece identifiers.
This encoder also splits each token on punctuation characters, as expected by
most BERT models.

This model must be separately initialized using an appropriate loader.

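A plausible initialization sketch, assuming the `HFPieceEncoderLoader.v1`
loader and the `piecer_loader` hook that spacy-curated-transformers uses for
this purpose (both names, and the `bert-base-uncased` model, are assumptions
here rather than confirmed by this page):

```ini
[components.transformer.model.piece_encoder]
@architectures = "spacy-curated-transformers.BertWordpieceEncoder.v1"

# Load the encoder's vocabulary from a Hugging Face model at init time
[initialize.components.transformer.piecer_loader]
@model_loaders = "spacy-curated-transformers.HFPieceEncoderLoader.v1"
name = "bert-base-uncased"
```
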
### spacy-curated-transformers.ByteBpeEncoder.v1

Construct a Byte-BPE piece encoder model that accepts a list of token sequences
or documents and returns a corresponding list of piece identifiers.

This model must be separately initialized using an appropriate loader.

### spacy-curated-transformers.CamembertSentencepieceEncoder.v1

Construct a SentencePiece piece encoder model that accepts a list of token
sequences or documents and returns a corresponding list of piece identifiers
with CamemBERT post-processing applied.

This model must be separately initialized using an appropriate loader.

### spacy-curated-transformers.CharEncoder.v1

Construct a character piece encoder model that accepts a list of token
sequences or documents and returns a corresponding list of piece identifiers.

This model must be separately initialized using an appropriate loader.

### spacy-curated-transformers.SentencepieceEncoder.v1

Construct a SentencePiece piece encoder model that accepts a list of token
sequences or documents and returns a corresponding list of piece identifiers.

This model must be separately initialized using an appropriate loader.

### spacy-curated-transformers.WordpieceEncoder.v1

Construct a WordPiece piece encoder model that accepts a list of token
sequences or documents and returns a corresponding list of piece identifiers.
This encoder also splits each token on punctuation characters, as expected by
most BERT models.

This model must be separately initialized using an appropriate loader.

### spacy-curated-transformers.XlmrSentencepieceEncoder.v1

Construct a SentencePiece piece encoder model that accepts a list of token
sequences or documents and returns a corresponding list of piece identifiers
with XLM-RoBERTa post-processing applied.

This model must be separately initialized using an appropriate loader.

## Pretraining architectures {id="pretrain",source="spacy/ml/models/multi_task.py"}

… objective for a Tok2Vec layer. To use this objective, make sure that the
`initialize.vectors` section in the config refers to a model with static
vectors.

| Name            | Description                                                                               |
| --------------- | ------------------------------------------------------------------------------------------ |
| `maxout_pieces` | The number of maxout pieces to use. Recommended values are `2` or `3`. ~~int~~              |
| `hidden_size`   | Size of the hidden layer of the model. ~~int~~                                              |
| `loss`          | The loss function can be either "cosine" or "L2". We typically recommend using "cosine". ~~str~~ |
| …               | …                                                                                           |
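
The parameter names above match spaCy's `spacy.PretrainVectors.v1` objective.
Assuming that is the architecture documented here (its heading is cut off in
this diff), a pretraining block might look like this sketch, with illustrative
values:

```ini
[pretraining.objective]
@architectures = "spacy.PretrainVectors.v1"
maxout_pieces = 3
hidden_size = 300
loss = "cosine"
```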