diff --git a/website/docs/api/transformer.md b/website/docs/api/transformer.md
index c32651e02..b09455b41 100644
--- a/website/docs/api/transformer.md
+++ b/website/docs/api/transformer.md
@@ -29,7 +29,7 @@ This pipeline component lets you use transformer models in your pipeline.
 Supports all models that are available via the
 [HuggingFace `transformers`](https://huggingface.co/transformers) library.
 Usually you will connect subsequent components to the shared transformer using
-the [TransformerListener](/api/architectures#TransformerListener) layer. This
+the [TransformerListener](/api/architectures#transformers-Tok2VecListener) layer. This
 works similarly to spaCy's [Tok2Vec](/api/tok2vec) component and
 [Tok2VecListener](/api/architectures/Tok2VecListener) sublayer.
 
@@ -233,7 +233,7 @@ The `Transformer` component therefore does **not** perform a weight update
 during its own `update` method. Instead, it runs its transformer model and
 communicates the output and the backpropagation callback to any **downstream
 components** that have been connected to it via the
-[TransformerListener](/api/architectures#TransformerListener) sublayer. If there
+[TransformerListener](/api/architectures#transformers-Tok2VecListener) sublayer. If there
 are multiple listeners, the last layer will actually backprop to the
 transformer and call the optimizer, while the others simply increment the
 gradients.
diff --git a/website/docs/usage/embeddings-transformers.md b/website/docs/usage/embeddings-transformers.md
index e2c1a6fd0..b5f58927a 100644
--- a/website/docs/usage/embeddings-transformers.md
+++ b/website/docs/usage/embeddings-transformers.md
@@ -101,7 +101,7 @@ it processes a batch of documents, it will pass forward its predictions to the
 listeners, allowing the listeners to **reuse the predictions** when they are
 eventually called. A similar mechanism is used to pass gradients from the
 listeners back to the model. The [`Transformer`](/api/transformer) component and
-[TransformerListener](/api/architectures#TransformerListener) layer do the same
+[TransformerListener](/api/architectures#transformers-Tok2VecListener) layer do the same
 thing for transformer models, but the `Transformer` component will also save
 the transformer outputs to the
 [`Doc._.trf_data`](/api/transformer#custom_attributes) extension attribute,
@@ -179,7 +179,7 @@ interoperates with [PyTorch](https://pytorch.org) and the
 giving you access to thousands of pretrained models for your pipelines. There
 are many [great guides](http://jalammar.github.io/illustrated-transformer/) to
 transformer models, but for practical purposes, you can simply think of them as
-a drop-in replacement that let you achieve **higher accuracy** in exchange for
+drop-in replacements that let you achieve **higher accuracy** in exchange for
 **higher training and runtime costs**.
 
 ### Setup and installation {#transformers-installation}
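
For context on the listener mechanism these link targets describe, a minimal runtime sketch follows. It assumes a transformer-based pipeline package such as `en_core_web_trf` is installed; the package name and example sentence are illustrative only.

```python
import spacy

# Minimal sketch, assuming the transformer-based pipeline package
# "en_core_web_trf" is installed; package name and text are illustrative.
nlp = spacy.load("en_core_web_trf")
doc = nlp("The shared transformer passes its predictions to listener layers.")

# The Transformer component stores its raw outputs on the Doc, so downstream
# components connected through a listener layer can reuse those predictions
# instead of re-running the transformer model.
print(doc._.trf_data)
```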