Update website/docs/usage/embeddings-transformers.mdx

Adriane Boyd 2023-04-03 10:57:36 +02:00 committed by GitHub
parent be24c0a0b7
commit a562767336


@@ -745,12 +745,12 @@ To benefit from pretraining, your training step needs to know to initialize its
 this by setting `initialize.init_tok2vec` to the filename of the `.bin` file
 that you want to use from pretraining.
-Similar to training, pretraining produces a `model-last.bin` file which is the
-last iteration of the trained weights which you can use to initialize your
-`tok2vec` layer. Additionally, you can configure `n_save_epoch` to tell
-pretraining in which epoch interval it should save the current training
-progress. To make use of the final output, you could fill in this value in your
-config file:
+A pretraining step that runs for 5 epochs with an output path of `pretrain/`, as
+an example, produces `pretrain/model0.bin` through `pretrain/model4.bin` plus a
+copy of the last iteration as `pretrain/model-last.bin`. Additionally, you can
+configure `n_save_epoch` to tell pretraining in which epoch interval it should
+save the current training progress. To use the final output to initialize your
+`tok2vec` layer, you could fill in this value in your config file:
 ```ini {title="config.cfg"}