Mirror of https://github.com/explosion/spaCy.git
Update pretrain docs
parent f3aba49830 · commit a12a61400c
@@ -754,7 +754,7 @@ in the section `[paths]`.
 </Infobox>

 ```cli
-$ python -m spacy train [config_path] [--output] [--code] [--verbose] [overrides]
+$ python -m spacy train [config_path] [--output] [--code] [--verbose] [--gpu-id] [overrides]
 ```

 | Name | Description |
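The only change here is the new `--gpu-id` option, which selects the GPU to train on and defaults to `-1` (CPU). As a minimal usage sketch, with `config.cfg` and `./output` as placeholder paths:

```cli
$ python -m spacy train config.cfg --output ./output --gpu-id 0
```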
@@ -778,8 +778,8 @@ pretrained ones. The weights are saved to a directory after each epoch. You can
 then include a **path to one of these pretrained weights files** in your
 [training config](/usage/training#config) as the `init_tok2vec` setting when you
 train your pipeline. This technique may be especially helpful if you have little
-labelled data. See the usage docs on [pretraining](/usage/training#pretraining)
-for more info.
+labelled data. See the usage docs on
+[pretraining](/usage/embeddings-transformers#pretraining) for more info.

 <Infobox title="Changed in v3.0" variant="warning">

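To connect this passage to the command above: assuming the training config exposes the weights path as a `[paths]` variable named `init_tok2vec` (as the auto-generated configs typically do) and reads the setting from there, the path can also be supplied as a CLI override. The config name and weights file below are placeholders:

```cli
$ python -m spacy train config.cfg --output ./output --paths.init_tok2vec ./pretrain_output/model99.bin
```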
@@ -794,7 +794,7 @@ auto-generated by setting `--pretraining` on
 </Infobox>

 ```cli
-$ python -m spacy pretrain [config_path] [output_dir] [--code] [--resume-path] [--epoch-resume] [overrides]
+$ python -m spacy pretrain [config_path] [output_dir] [--code] [--resume-path] [--epoch-resume] [--gpu-id] [overrides]
 ```

 | Name | Description |
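A hedged sketch of running the updated command, once from scratch on GPU 0 and once resuming from a previously saved epoch; the config name, output directory and weights file are placeholders:

```cli
$ python -m spacy pretrain config.cfg ./pretrain_output --gpu-id 0
$ python -m spacy pretrain config.cfg ./pretrain_output --resume-path ./pretrain_output/model49.bin --epoch-resume 49
```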
@@ -752,7 +752,7 @@ network to model something about word cooccurrence statistics. Predicting
 leading and trailing characters does that more than adequately, as the exact
 word sequence could be recovered with high accuracy if the initial and trailing
 characters are predicted accurately. With the vectors objective, the pretraining
-is use the embedding space learned by an algorithm such as
+uses the embedding space learned by an algorithm such as
 [GloVe](https://nlp.stanford.edu/projects/glove/) or
 [Word2vec](https://code.google.com/archive/p/word2vec/), allowing the model to
 focus on the contextual modelling we actual care about.
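The vectors objective presupposes that such an embedding space is available to the pipeline. As an aside not covered by this diff, one way to package GloVe-style vectors for spaCy is the `init vectors` command; the vectors file and output directory below are placeholder names:

```cli
$ python -m spacy init vectors en glove.6B.300d.txt ./vectors_model
```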
@@ -175,7 +175,7 @@ sections of a config file are:
 | `paths` | Paths to data and other assets. Re-used across the config as variables, e.g. `${paths.train}`, and can be [overwritten](#config-overrides) on the CLI. |
 | `system` | Settings related to system and hardware. Re-used across the config as variables, e.g. `${system.seed}`, and can be [overwritten](#config-overrides) on the CLI. |
 | `training` | Settings and controls for the training and evaluation process. |
-| `pretraining` | Optional settings and controls for the [language model pretraining](#pretraining). |
+| `pretraining` | Optional settings and controls for the [language model pretraining](/usage/embeddings-transformers#pretraining). |

 <Infobox title="Config format and settings" emoji="📖">

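As a quick illustration of the override mechanism mentioned in the table, values from these sections can be replaced on the command line using dotted names; the training corpus path below is a placeholder:

```cli
$ python -m spacy train config.cfg --paths.train ./corpus/train.spacy --system.seed 0
```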