Mirror of https://github.com/explosion/spaCy.git, synced 2025-10-24 12:41:23 +03:00
Similar to how vectors are handled, move the vocab lookups to be loaded
at the start of training rather than when the vocab is initialized,
since the vocab doesn't have access to the full config when it's
created.
The option moves from `nlp.load_vocab_data` to `training.lookups`.
Typically these tables will come from `spacy-lookups-data`, but any
`Lookups` object can be provided.
The loading from `spacy-lookups-data` is now strict, so configs for each
language should specify the exact tables required. This also makes it
easier to control whether the larger clusters and probs tables are
included.
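
The strict behaviour described above can be sketched in plain Python. This is only an illustration, not spaCy's implementation: `AVAILABLE_TABLES`, `load_lookups_strict`, and the toy table contents are hypothetical names and data. The point is that every requested table must exist, and only the requested tables are returned.

```python
from typing import Dict, Iterable

# Hypothetical stand-in for the tables a lookups package might ship.
# The contents are toy data, not real spacy-lookups-data entries.
AVAILABLE_TABLES: Dict[str, Dict[str, Dict[str, str]]] = {
    "en": {
        "lexeme_norm": {"toy_key": "toy_norm"},
        "lexeme_prob": {"toy_key": "-1.0"},
    },
}


def load_lookups_strict(lang: str, tables: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Return exactly the requested tables; raise if any is unavailable."""
    available = AVAILABLE_TABLES.get(lang, {})
    requested = list(tables)
    missing = [name for name in requested if name not in available]
    if missing:
        # Strict: fail loudly instead of silently skipping a table.
        raise ValueError(f"Missing lookup tables for '{lang}': {missing}")
    # Tables that were not requested (e.g. larger ones) are never loaded.
    return {name: available[name] for name in requested}


lookups = load_lookups_strict("en", ["lexeme_norm"])
print(sorted(lookups))
```

Because the caller lists tables explicitly, a larger table such as the toy `lexeme_prob` entry is left out unless a config asks for it.
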
To load `lexeme_norm` from `spacy-lookups-data`:
```
[training.lookups]
@misc = "spacy.LoadLookupsData.v1"
lang = ${nlp.lang}
tables = ["lexeme_norm"]
```
| Name |
|---|
| project |
| templates |
| __init__.py |
| _util.py |
| convert.py |
| debug_config.py |
| debug_data.py |
| debug_model.py |
| download.py |
| evaluate.py |
| info.py |
| init_config.py |
| init_model.py |
| package.py |
| pretrain.py |
| profile.py |
| train.py |
| validate.py |