Fix duplicate entries in tables

shadeMe 2023-07-20 16:05:42 +02:00
parent a775fa25ad
commit cca478152e

@@ -493,18 +493,16 @@ how to integrate the architectures into your training config.
Construct an ALBERT transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `embedding_width` | Width of the embedding representations. ~~int~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -522,17 +520,15 @@ Construct an ALBERT transformer model.
Construct a BERT transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -549,17 +545,15 @@ Construct a BERT transformer model.
Construct a CamemBERT transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -576,17 +570,15 @@ Construct a CamemBERT transformer model.
Construct a RoBERTa transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -603,17 +595,15 @@ Construct a RoBERTa transformer model.
Construct a XLM-RoBERTa transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
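
All five constructors touched by this commit take essentially the same parameter set (ALBERT additionally has `embedding_width`), so one example covers them. As a rough orientation only, the sketch below shows how these parameters could appear in a training config block. The registered architecture name, the component path, and the concrete values are assumptions for illustration and are not taken from this diff; `with_spans` and `piece_encoder` would be supplied as nested config blocks rather than inline values.

```ini
# Minimal sketch of a transformer model block in a training config.
# Assumptions (not from this diff): the architecture name
# "spacy-curated-transformers.BertTransformer.v1", the component path
# "components.transformer.model", and all values below, which mirror
# common BERT-base-style settings. Only parameters visible in the
# tables above are shown; the full tables list more.
[components.transformer.model]
@architectures = "spacy-curated-transformers.BertTransformer.v1"
vocab_size = 28996
hidden_width = 768
intermediate_width = 3072
hidden_act = "gelu"
attention_probs_dropout_prob = 0.1
hidden_dropout_prob = 0.1
layer_norm_eps = 0.000000000001
max_position_embeddings = 512
model_max_length = 512

# `with_spans` and `piece_encoder` are callbacks/models and would be
# filled in through nested blocks, for example:
# [components.transformer.model.with_spans]
# @architectures = "..."
# [components.transformer.model.piece_encoder]
# @architectures = "..."
```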