mirror of https://github.com/explosion/spaCy.git
synced 2025-01-27 09:44:36 +03:00
Notes on source with vectors
parent 35425d7e26
commit 92dc6b409e
@@ -220,4 +220,34 @@ working as expected, you can update the spaCy version requirements in the
```
+ "spacy_version": ">=3.0.0,<3.2.0",
```

<!-- TODO: vectors initialization and anything else we want to mention -->

### Sourcing pipeline components with vectors {#source-vectors}

If you're sourcing a pipeline component that requires static vectors (for
example, a tagger or parser from an `md` or `lg` pretrained pipeline), be sure
to include the source model's vectors in the setting `[initialize.vectors]`. In
spaCy v3.0, a bug allowed vectors to be loaded implicitly through `source`;
however, in v3.1 this setting must be provided explicitly as
`[initialize.vectors]`:

```ini
### config.cfg (excerpt)
[components.ner]
source = "en_core_web_md"

[initialize]
vectors = "en_core_web_md"
```
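
As a rough illustration of the rule above, the following sketch reads a config excerpt with Python's standard library and reports sourced components whose source pipeline is not also named under `[initialize]`. The helper `check_sourced_vectors` is hypothetical and not part of spaCy, and it assumes a flat excerpt like the one shown here rather than a full config with variable interpolation.

```python
import configparser
import json

CONFIG_TEXT = """
[components.ner]
source = "en_core_web_md"

[initialize]
vectors = "en_core_web_md"
"""


def check_sourced_vectors(config_text):
    """Return (sources, vectors, problems) for a config string.

    Hypothetical helper for illustration only; it is not a spaCy API.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    # Collect top-level [components.<name>] sections that use `source`.
    sources = {}
    for section in parser.sections():
        parts = section.split(".")
        if len(parts) == 2 and parts[0] == "components":
            if parser.has_option(section, "source"):
                # Config values are JSON-style, so strip the quotes via json.
                sources[parts[1]] = json.loads(parser.get(section, "source"))

    # Read the vectors setting from [initialize], if present.
    vectors = None
    if parser.has_option("initialize", "vectors"):
        vectors = json.loads(parser.get("initialize", "vectors"))

    problems = []
    for name, source in sources.items():
        if vectors is None:
            problems.append(f"{name}: sourced from {source!r}, but no vectors are set")
        elif vectors != source:
            problems.append(f"{name}: sourced from {source!r}, but vectors = {vectors!r}")
    return sources, vectors, problems


sources, vectors, problems = check_sourced_vectors(CONFIG_TEXT)
print(sources, vectors, problems)
```

A name mismatch reported by this sketch is only a hint to review the config: two differently named pipelines could in principle ship the same vectors, which a name comparison cannot detect.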

<Infobox title="Important note" variant="warning">

Each pipeline can only store one set of static vectors, so it's not possible to
assemble a pipeline with components that were trained on different static
vectors.

</Infobox>

[`spacy train`](/api/cli#train) and [`spacy assemble`](/api/cli#assemble) will
provide warnings if the source and target pipelines don't contain the same
vectors. If you are sourcing a rule-based component like an entity ruler or
lemmatizer that does not use the vectors as a model feature, then this warning
can be safely ignored.
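
The triage described above can be sketched in a few lines of Python. This is an illustration only: the function `triage_vector_warnings` and its input format are hypothetical, and the set of factory names is an assumption covering just the two rule-based components mentioned in the text.

```python
# Assumption: factory names of components that do not use static vectors as a
# model feature. Extend this set to match your own pipeline.
RULE_BASED_FACTORIES = {"entity_ruler", "lemmatizer"}


def triage_vector_warnings(sourced_components):
    """Split sourced components with mismatched vectors into two lists.

    `sourced_components` maps a component name to a (factory, same_vectors)
    tuple, where `same_vectors` says whether the source and target pipelines
    contain the same vectors. Hypothetical helper, not a spaCy API.
    """
    ignorable, needs_attention = [], []
    for name, (factory, same_vectors) in sourced_components.items():
        if same_vectors:
            continue  # no mismatch, so no warning to triage
        if factory in RULE_BASED_FACTORIES:
            ignorable.append(name)  # vectors are not a model feature here
        else:
            needs_attention.append(name)  # component may rely on the vectors
    return ignorable, needs_attention


ignorable, needs_attention = triage_vector_warnings(
    {
        "ner": ("ner", False),
        "entity_ruler": ("entity_ruler", False),
        "tagger": ("tagger", True),
    }
)
print("safe to ignore:", ignorable)
print("check vectors:", needs_attention)
```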