diff --git a/website/docs/api/annotation.md b/website/docs/api/annotation.md
index fb8b67c1e..34065de91 100644
--- a/website/docs/api/annotation.md
+++ b/website/docs/api/annotation.md
@@ -501,7 +501,7 @@ entities:
 > than the **BILUO** scheme that we use, which explicitly marks boundary tokens.
 
 spaCy translates the character offsets into this scheme, in order to decide the
-cost of each action given the current state of the entity recogniser. The costs
+cost of each action given the current state of the entity recognizer. The costs
 are then used to calculate the gradient of the loss, to train the model. The
 exact algorithm is a pastiche of well-known methods, and is not currently
 described in any single publication. The model is a greedy transition-based
diff --git a/website/docs/api/entitylinker.md b/website/docs/api/entitylinker.md
index 88131761f..a9d6a31a5 100644
--- a/website/docs/api/entitylinker.md
+++ b/website/docs/api/entitylinker.md
@@ -46,9 +46,9 @@ shortcut for this and instantiate the component using its string name and
 | -------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `vocab` | `Vocab` | The shared vocabulary. |
 | `model` | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
-| `hidden_width` | int | Width of the hidden layer of the entity linking model, defaults to 128. |
-| `incl_prior` | bool | Whether or not to include prior probabilities in the model. Defaults to True. |
-| `incl_context` | bool | Whether or not to include the local context in the model (if not: only prior probabilites are used). Defaults to True. |
+| `hidden_width` | int | Width of the hidden layer of the entity linking model, defaults to `128`. |
+| `incl_prior` | bool | Whether or not to include prior probabilities in the model. Defaults to `True`. |
+| `incl_context` | bool | Whether or not to include the local context in the model (if not: only prior probabilities are used). Defaults to `True`. |
 | **RETURNS** | `EntityLinker` | The newly constructed object. |
 
 ## EntityLinker.\_\_call\_\_ {#call tag="method"}
diff --git a/website/docs/api/entityruler.md b/website/docs/api/entityruler.md
index 5b93fceac..607cb28ce 100644
--- a/website/docs/api/entityruler.md
+++ b/website/docs/api/entityruler.md
@@ -12,7 +12,7 @@ used on its own to implement a purely rule-based entity recognition system.
 After initialization, the component is typically added to the processing
 pipeline using [`nlp.add_pipe`](/api/language#add_pipe). For usage examples,
 see the docs on
-[rule-based entity recogntion](/usage/rule-based-matching#entityruler).
+[rule-based entity recognition](/usage/rule-based-matching#entityruler).
 
 ## EntityRuler.\_\_init\_\_ {#init tag="method"}
 
diff --git a/website/docs/api/lookups.md b/website/docs/api/lookups.md
index 9878546ea..bd3b38303 100644
--- a/website/docs/api/lookups.md
+++ b/website/docs/api/lookups.md
@@ -6,7 +6,7 @@ source: spacy/lookups.py
 new: 2.2
 ---
 
-This class allows convenient accesss to large lookup tables and dictionaries,
+This class allows convenient access to large lookup tables and dictionaries,
 e.g. lemmatization data or tokenizer exception lists using Bloom filters.
 Lookups are available via the [`Vocab`](/api/vocab) as `vocab.lookups`, so they
 can be accessed before the pipeline components are applied (e.g. in the
diff --git a/website/docs/usage/spacy-101.md b/website/docs/usage/spacy-101.md
index 4bfecb3a9..306186870 100644
--- a/website/docs/usage/spacy-101.md
+++ b/website/docs/usage/spacy-101.md
@@ -391,7 +391,7 @@ To support the entity linking task, spaCy stores external knowledge in a
 its data efficiently.
 
 > - **Mention**: A textual occurrence of a named entity, e.g. 'Miss Lovelace'.
-> - **KB ID**: A unique identifier refering to a particular real-world concept,
+> - **KB ID**: A unique identifier referring to a particular real-world concept,
 >   e.g. 'Q7259'.
 > - **Alias**: A plausible synonym or description for a certain KB ID, e.g. 'Ada
 >   Lovelace'.
diff --git a/website/docs/usage/training.md b/website/docs/usage/training.md
index f84fd0ed4..948138f91 100644
--- a/website/docs/usage/training.md
+++ b/website/docs/usage/training.md
@@ -625,7 +625,7 @@ https://github.com/explosion/spaCy/tree/master/examples/training/pretrain_kb.py
 2. **Pretrain the entity embeddings** by running the descriptions of the
    entities through a simple encoder-decoder network. The current implementation
    requires the `nlp` model to have access to pre-trained word embeddings, but a
-   custom implementation of this enoding step can also be used.
+   custom implementation of this encoding step can also be used.
 3. **Construct the KB** by defining all entities with their pretrained vectors,
    and all aliases with their prior probabilities.
 4. **Save** the KB using [`kb.dump`](/api/kb#dump).