Add how to load probability tables to existing models to spaCy docs (#12051)

* add section about adding tables to models
* change to lexeme_norm
* Change syntax
* change to _prob
* Update website/docs/usage/saving-loading.mdx

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2025-07-29 01:19:50 +03:00 · 2023-01-24 10:01:22 +01:00 · 2023-01-24 10:01:22 +01:00 · e9048fd4a1
commit e9048fd4a1
parent 950fceceb6
1 changed files with 22 additions and 0 deletions
--- a/website/docs/usage/saving-loading.mdx
+++ b/website/docs/usage/saving-loading.mdx
@@ -304,6 +304,28 @@ installed in the same environment – that's it.
 | `spacy_lookups`                                   | Group of entry points for custom [`Lookups`](/api/lookups), including lemmatizer data. Used by spaCy's [`spacy-lookups-data`](https://github.com/explosion/spacy-lookups-data) package.                                                                  |
 | [`spacy_displacy_colors`](#entry-points-displacy) | Group of entry points of custom label colors for the [displaCy visualizer](/usage/visualizers#ent). The key name doesn't matter, but it should point to a dict of labels and color values. Useful for custom models that predict different entity types. |

+### Loading probability tables into existing models
+
+You can load a probability table from [spacy-lookups-data](https://github.com/explosion/spacy-lookups-data) into an existing spaCy model like `en_core_web_sm`.
+
+```python
+# Requirements: pip install spacy-lookups-data
+import spacy
+from spacy.lookups import load_lookups
+nlp = spacy.load("en_core_web_sm")
+lookups = load_lookups("en", ["lexeme_prob"])
+nlp.vocab.lookups.add_table("lexeme_prob", lookups.get_table("lexeme_prob"))
+```
+
+When training a model from scratch, you can also specify probability tables in the `config.cfg`.
+
+```ini {title="config.cfg (excerpt)"}
+[initialize.lookups]
+@misc = "spacy.LookupsDataLoader.v1"
+lang = ${nlp.lang}
+tables = ["lexeme_prob"]
+```
+
 ### Custom components via entry points {id="entry-points-components"}

 When you load a pipeline, spaCy will generally use its `config.cfg` to set up