Fix typos and formatting [ci skip]

Ines Montani 2019-10-01 12:30:04 +02:00
parent ca0b20ae8b
commit 932ad9cb91
6 changed files with 8 additions and 8 deletions


@@ -501,7 +501,7 @@ entities:
 > than the **BILUO** scheme that we use, which explicitly marks boundary tokens.
 spaCy translates the character offsets into this scheme, in order to decide the
-cost of each action given the current state of the entity recogniser. The costs
+cost of each action given the current state of the entity recognizer. The costs
 are then used to calculate the gradient of the loss, to train the model. The
 exact algorithm is a pastiche of well-known methods, and is not currently
 described in any single publication. The model is a greedy transition-based
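Not part of this commit, but the passage being fixed describes how spaCy converts character offsets into the BILUO scheme before computing transition costs. A minimal sketch of that conversion step, using the `biluo_tags_from_offsets` helper from spaCy 2.x:

```python
import spacy
from spacy.gold import biluo_tags_from_offsets

nlp = spacy.blank("en")
doc = nlp("Ada Lovelace was born in London")
# Character-offset annotations: (start, end, label)
entities = [(0, 12, "PERSON"), (25, 31, "GPE")]
# Convert to the BILUO tags the transition-based NER trains against
tags = biluo_tags_from_offsets(doc, entities)
print(tags)  # ['B-PERSON', 'L-PERSON', 'O', 'O', 'O', 'U-GPE']
```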


@@ -46,9 +46,9 @@ shortcut for this and instantiate the component using its string name and
 | -------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
 | `vocab`        | `Vocab`                       | The shared vocabulary.                                                                                                                                  |
 | `model`        | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`.   |
-| `hidden_width` | int                           | Width of the hidden layer of the entity linking model, defaults to 128.                                                                                 |
-| `incl_prior`   | bool                          | Whether or not to include prior probabilities in the model. Defaults to True.                                                                           |
-| `incl_context` | bool                          | Whether or not to include the local context in the model (if not: only prior probabilites are used). Defaults to True.                                  |
+| `hidden_width` | int                           | Width of the hidden layer of the entity linking model, defaults to `128`.                                                                               |
+| `incl_prior`   | bool                          | Whether or not to include prior probabilities in the model. Defaults to `True`.                                                                         |
+| `incl_context` | bool                          | Whether or not to include the local context in the model (if not: only prior probabilities are used). Defaults to `True`.                               |
 | **RETURNS**    | `EntityLinker`                | The newly constructed object.                                                                                                                           |

 ## EntityLinker.\_\_call\_\_ {#call tag="method"}
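As an illustrative aside, the table above documents the constructor arguments; in spaCy 2.2 the component is typically created via its string name, with overrides passed as a config dict. Treat the exact config keys here as assumptions based on that table:

```python
import spacy

nlp = spacy.blank("en")
# Create the component by its registered string name; the config keys
# mirror the constructor arguments in the table above (assumed here)
entity_linker = nlp.create_pipe(
    "entity_linker",
    config={"hidden_width": 128, "incl_prior": True, "incl_context": True},
)
nlp.add_pipe(entity_linker)
```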


@@ -12,7 +12,7 @@ used on its own to implement a purely rule-based entity recognition system.
 After initialization, the component is typically added to the processing
 pipeline using [`nlp.add_pipe`](/api/language#add_pipe). For usage examples, see
 the docs on
-[rule-based entity recogntion](/usage/rule-based-matching#entityruler).
+[rule-based entity recognition](/usage/rule-based-matching#entityruler).

 ## EntityRuler.\_\_init\_\_ {#init tag="method"}
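For context, a brief sketch of the usage pattern this hunk references, adding the `EntityRuler` to the pipeline with the spaCy 2.x API:

```python
from spacy.lang.en import English
from spacy.pipeline import EntityRuler

nlp = English()
ruler = EntityRuler(nlp)
# Token and phrase patterns drive the purely rule-based recognition
ruler.add_patterns([{"label": "ORG", "pattern": "Apple"}])
nlp.add_pipe(ruler)

doc = nlp("Apple is opening its first big office in San Francisco.")
print([(ent.text, ent.label_) for ent in doc.ents])  # [('Apple', 'ORG')]
```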


@@ -6,7 +6,7 @@ source: spacy/lookups.py
 new: 2.2
 ---

-This class allows convenient accesss to large lookup tables and dictionaries,
+This class allows convenient access to large lookup tables and dictionaries,
 e.g. lemmatization data or tokenizer exception lists using Bloom filters.
 Lookups are available via the [`Vocab`](/api/vocab) as `vocab.lookups`, so they
 can be accessed before the pipeline components are applied (e.g. in the
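Since this hunk documents the `Lookups` class new in spaCy 2.2, a minimal sketch of the table API described above; the table name and data are arbitrary example values:

```python
from spacy.lookups import Lookups

lookups = Lookups()
# Tables are Bloom-filter-backed dictionaries keyed by name
table = lookups.add_table("lemma_exceptions", {"geese": "goose"})
print(lookups.has_table("lemma_exceptions"))  # True
print(table.get("geese"))                     # 'goose'
```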


@@ -391,7 +391,7 @@ To support the entity linking task, spaCy stores external knowledge in a
 its data efficiently.

 > - **Mention**: A textual occurrence of a named entity, e.g. 'Miss Lovelace'.
-> - **KB ID**: A unique identifier refering to a particular real-world concept,
+> - **KB ID**: A unique identifier referring to a particular real-world concept,
 >   e.g. 'Q7259'.
 > - **Alias**: A plausible synonym or description for a certain KB ID, e.g. 'Ada
 >   Lovelace'.
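To ground the terminology in the hunk above (mention, KB ID, alias), a sketch using the spaCy 2.2 `KnowledgeBase` API; the vector length, frequency and vector values are arbitrary example numbers:

```python
import spacy
from spacy.kb import KnowledgeBase

nlp = spacy.blank("en")
kb = KnowledgeBase(vocab=nlp.vocab, entity_vector_length=3)
# KB ID "Q7259" identifies the real-world concept (Ada Lovelace)
kb.add_entity(entity="Q7259", freq=12, entity_vector=[1.0, 2.0, 3.0])
# An alias maps a surface form to candidate KB IDs with prior probabilities
kb.add_alias(alias="Ada Lovelace", entities=["Q7259"], probabilities=[1.0])
print(kb.get_candidates("Ada Lovelace"))
```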


@@ -625,7 +625,7 @@ https://github.com/explosion/spaCy/tree/master/examples/training/pretrain_kb.py
 2. **Pretrain the entity embeddings** by running the descriptions of the
    entities through a simple encoder-decoder network. The current implementation
    requires the `nlp` model to have access to pre-trained word embeddings, but a
-   custom implementation of this enoding step can also be used.
+   custom implementation of this encoding step can also be used.
 3. **Construct the KB** by defining all entities with their pretrained vectors,
    and all aliases with their prior probabilities.
 4. **Save** the KB using [`kb.dump`](/api/kb#dump).
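Step 4 references `kb.dump`; a sketch of saving and reloading a KB with the spaCy 2.2 API, assuming a populated `kb` like the one above (the path is an arbitrary example):

```python
kb.dump("my_kb")  # serialize entities, vectors and aliases to disk

# Reloading requires a KB initialized with the same vocab and vector length
from spacy.kb import KnowledgeBase
kb2 = KnowledgeBase(vocab=nlp.vocab, entity_vector_length=3)
kb2.load_bulk("my_kb")
```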