diff --git a/spacy/ml/models/tok2vec.py b/spacy/ml/models/tok2vec.py
index 23cfe883b..1a78cf75e 100644
--- a/spacy/ml/models/tok2vec.py
+++ b/spacy/ml/models/tok2vec.py
@@ -110,7 +110,7 @@ def MultiHashEmbed(
 
     The features used can be configured with the 'attrs' argument. The suggested
     attributes are NORM, PREFIX, SUFFIX and SHAPE. This lets the model take into
-    account some subword information, without construction a fully character-based
+    account some subword information, without constructing a fully character-based
     representation. If pretrained vectors are available, they can be included in
-    the representation as well, with the vectors table will be kept static (i.e.
-    it's not updated).
+    the representation as well, though the vectors table will be kept static
+    (i.e. it's not updated).
diff --git a/website/docs/usage/embeddings-transformers.md b/website/docs/usage/embeddings-transformers.md
index 73540b3d3..856685dad 100644
--- a/website/docs/usage/embeddings-transformers.md
+++ b/website/docs/usage/embeddings-transformers.md
@@ -516,16 +516,14 @@
 Many neural network models are able to use word vector tables as additional
 features, which sometimes results in significant improvements in accuracy.
 spaCy's built-in embedding layer,
 [MultiHashEmbed](/api/architectures#MultiHashEmbed), can be configured to use
-word vector tables using the `also_use_static_vectors` flag. This setting is
-also available on the [MultiHashEmbedCNN](/api/architectures#MultiHashEmbedCNN)
-layer, which builds the default token-to-vector encoding architecture.
+word vector tables using the `include_static_vectors` flag.
 
 ```ini
 [tagger.model.tok2vec.embed]
 @architectures = "spacy.MultiHashEmbed.v1"
 width = 128
-rows = 7000
-also_embed_subwords = true
-also_use_static_vectors = true
+attrs = ["NORM", "PREFIX", "SUFFIX", "SHAPE"]
+rows = [7000, 3500, 3500, 3500]
+include_static_vectors = true
 ```
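
For reviewers, a minimal sketch of how the renamed argument reads when the layer is built directly in Python rather than from the config (assuming spaCy v3 with this patch applied; importing straight from `spacy.ml.models.tok2vec`, the file touched above, is an assumption for illustration):

```python
# Hypothetical usage sketch, not part of this diff: construct the
# "spacy.MultiHashEmbed.v1" layer directly with the renamed argument.
from spacy.ml.models.tok2vec import MultiHashEmbed

embed = MultiHashEmbed(
    width=128,
    # One rows entry per attribute: NORM gets the largest hash table,
    # while the subword features (PREFIX, SUFFIX, SHAPE) get smaller
    # ones, mirroring the ini example above.
    attrs=["NORM", "PREFIX", "SUFFIX", "SHAPE"],
    rows=[7000, 3500, 3500, 3500],
    include_static_vectors=True,  # renamed from also_use_static_vectors
)
```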