Fix duplicate entries in tables

shadeMe 2023-07-20 16:05:42 +02:00
parent a775fa25ad
commit cca478152e

@@ -493,18 +493,16 @@ how to integrate the architectures into your training config.
Construct an ALBERT transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `embedding_width` | Width of the embedding representations. ~~int~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -522,17 +520,15 @@ Construct an ALBERT transformer model.
Construct a BERT transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -549,17 +545,15 @@ Construct a BERT transformer model.
Construct a CamemBERT transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -576,17 +570,15 @@ Construct a CamemBERT transformer model.
Construct a RoBERTa transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
@@ -603,17 +595,15 @@ Construct a RoBERTa transformer model.
Construct a XLM-RoBERTa transformer model.
| Name | Description |
-| ------------------------------ | ----------------------------------------------------------------------------- |
+| ------------------------------ | ------------------------------------------------------------------------------------------ |
| `vocab_size` | Vocabulary size. ~~int~~ |
| `with_spans` | Callback that constructs a span generator model. ~~Callable~~ |
| `piece_encoder` | The piece encoder to segment input tokens. ~~Model~~ |
| `attention_probs_dropout_prob` | Dropout probabilty of the self-attention layers. ~~float~~ |
| `hidden_act` | Activation used by the point-wise feed-forward layers. ~~str~~ |
-| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and ~~float~~              |
-| `hidden_dropout_prob`          | embedding layers. ~~float~~                                                   |
+| `hidden_dropout_prob`          | Dropout probabilty of the point-wise feed-forward and embedding layers. ~~float~~          |
| `hidden_width` | Width of the final representations. ~~int~~ |
-| `intermediate_width`           | Width of the intermediate projection layer in the ~~int~~                    |
-| `intermediate_width`           | point-wise feed-forward layer. ~~int~~                                        |
+| `intermediate_width`           | Width of the intermediate projection layer in the point-wise feed-forward layer. ~~int~~   |
| `layer_norm_eps` | Epsilon for layer normalization. ~~float~~ |
| `max_position_embeddings` | Maximum length of position embeddings. ~~int~~ |
| `model_max_length` | Maximum length of model inputs. ~~int~~ |
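
All five constructors touched by this commit take essentially the same parameter set (ALBERT additionally has `embedding_width`), so one example covers them. As a rough orientation only, the sketch below shows how these parameters could appear in a training config block. The registered architecture name, the component path, and the concrete values are assumptions for illustration and are not taken from this diff; `with_spans` and `piece_encoder` would be supplied as nested config blocks rather than inline values.

```ini
# Minimal sketch of a transformer model block in a training config.
# Assumptions (not from this diff): the architecture name
# "spacy-curated-transformers.BertTransformer.v1", the component path
# "components.transformer.model", and all values below, which mirror
# common BERT-base-style settings. Only parameters visible in the
# tables above are shown; the full tables list more.
[components.transformer.model]
@architectures = "spacy-curated-transformers.BertTransformer.v1"
vocab_size = 28996
hidden_width = 768
intermediate_width = 3072
hidden_act = "gelu"
attention_probs_dropout_prob = 0.1
hidden_dropout_prob = 0.1
layer_norm_eps = 0.000000000001
max_position_embeddings = 512
model_max_length = 512

# `with_spans` and `piece_encoder` are callbacks/models and would be
# filled in through nested blocks, for example:
# [components.transformer.model.with_spans]
# @architectures = "..."
# [components.transformer.model.piece_encoder]
# @architectures = "..."
```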