Add how to load probability tables to existing models to spaCy docs (#12051)

* add section about adding tables to models

* change to lexeme_norm

* Change syntax

* change to _prob

* Update website/docs/usage/saving-loading.mdx

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Edward 2023-01-24 10:01:22 +01:00 committed by GitHub
parent 950fceceb6
commit e9048fd4a1


@@ -304,6 +304,28 @@ installed in the same environment that's it.
| `spacy_lookups` | Group of entry points for custom [`Lookups`](/api/lookups), including lemmatizer data. Used by spaCy's [`spacy-lookups-data`](https://github.com/explosion/spacy-lookups-data) package. |
| [`spacy_displacy_colors`](#entry-points-displacy) | Group of entry points of custom label colors for the [displaCy visualizer](/usage/visualizers#ent). The key name doesn't matter, but it should point to a dict of labels and color values. Useful for custom models that predict different entity types. |
### Loading probability tables into existing models
You can load a probability table from [spacy-lookups-data](https://github.com/explosion/spacy-lookups-data) into an existing spaCy model like `en_core_web_sm`.
```python
# Requirements: pip install spacy-lookups-data
import spacy
from spacy.lookups import load_lookups
nlp = spacy.load("en_core_web_sm")
lookups = load_lookups("en", ["lexeme_prob"])
nlp.vocab.lookups.add_table("lexeme_prob", lookups.get_table("lexeme_prob"))
```
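To check that a table registered this way is actually picked up, you can read the probability back from a lexeme. The following is a minimal sketch under stated assumptions, not part of the original change: it uses a blank English pipeline and a hand-made `toy_probs` dict (hypothetical values invented for illustration) instead of the real data from `spacy-lookups-data`, and it assumes that in spaCy v3 `Lexeme.prob` is read from the `lexeme_prob` table in `nlp.vocab.lookups`.

```python
import spacy

# A blank pipeline keeps the sketch self-contained: no trained model
# and no spacy-lookups-data install are needed here.
nlp = spacy.blank("en")

# Hypothetical log-probabilities, made up for illustration only.
# Real values would come from load_lookups("en", ["lexeme_prob"]).
toy_probs = {"the": -3.5, "cat": -9.25}

# Register the table under the name spaCy is assumed to look up.
nlp.vocab.lookups.add_table("lexeme_prob", toy_probs)

# Words in the table should report their stored value; words missing
# from the table fall back to a default out-of-vocabulary probability.
the_prob = nlp.vocab["the"].prob
oov_prob = nlp.vocab["zyzzyva"].prob
print(the_prob, oov_prob)
```

With the real table loaded as shown earlier, the same `nlp.vocab["the"].prob` lookup should work against `en_core_web_sm`.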
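When training a model from scratch, you can instead specify the probability tables in the `[initialize.lookups]` block of the `config.cfg`, so that they are loaded when the pipeline is initialized.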
```ini {title="config.cfg (excerpt)"}
[initialize.lookups]
@misc = "spacy.LookupsDataLoader.v1"
lang = ${nlp.lang}
tables = ["lexeme_prob"]
```
### Custom components via entry points {id="entry-points-components"}
When you load a pipeline, spaCy will generally use its `config.cfg` to set up