mirror of
https://github.com/explosion/spaCy.git
synced 2025-01-14 03:26:24 +03:00
Fix typos and formatting [ci skip]
parent
ca0b20ae8b
commit
932ad9cb91
|
@ -501,7 +501,7 @@ entities:
|
||||||
> than the **BILUO** scheme that we use, which explicitly marks boundary tokens.
|
> than the **BILUO** scheme that we use, which explicitly marks boundary tokens.
|
||||||
|
|
||||||
spaCy translates the character offsets into this scheme, in order to decide the
|
spaCy translates the character offsets into this scheme, in order to decide the
|
||||||
cost of each action given the current state of the entity recogniser. The costs
|
cost of each action given the current state of the entity recognizer. The costs
|
||||||
are then used to calculate the gradient of the loss, to train the model. The
|
are then used to calculate the gradient of the loss, to train the model. The
|
||||||
exact algorithm is a pastiche of well-known methods, and is not currently
|
exact algorithm is a pastiche of well-known methods, and is not currently
|
||||||
described in any single publication. The model is a greedy transition-based
|
described in any single publication. The model is a greedy transition-based
|
||||||
|
|
|
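The hunk above describes translating character offsets into the BILUO scheme. As a rough illustration only (this is not spaCy's implementation, and the helper names `whitespace_spans` and `biluo_from_offsets` are hypothetical), the mapping could be sketched in plain Python, assuming whitespace tokenization and entities that align exactly with token boundaries:

```python
def whitespace_spans(text):
    """Return (start_char, end_char) for each whitespace-separated token."""
    spans = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        spans.append((start, end))
        pos = end
    return spans


def biluo_from_offsets(token_spans, entities):
    """Map (start_char, end_char, label) entities onto per-token BILUO tags.

    B = begin, I = inside, L = last, U = unit (single token), O = outside.
    Entities that do not align with token boundaries are skipped in this
    sketch; a real system has to handle them more carefully.
    """
    tags = ["O"] * len(token_spans)
    start_to_token = {start: i for i, (start, _) in enumerate(token_spans)}
    end_to_token = {end: i for i, (_, end) in enumerate(token_spans)}
    for ent_start, ent_end, label in entities:
        first = start_to_token.get(ent_start)
        last = end_to_token.get(ent_end)
        if first is None or last is None or first > last:
            continue  # misaligned entity: ignored here (assumption)
        if first == last:
            tags[first] = "U-" + label
        else:
            tags[first] = "B-" + label
            for i in range(first + 1, last):
                tags[i] = "I-" + label
            tags[last] = "L-" + label
    return tags


text = "Ada Lovelace was born in London"
entities = [(0, 12, "PERSON"), (25, 31, "GPE")]
tags = biluo_from_offsets(whitespace_spans(text), entities)
print(tags)
```

The boundary tags (`B-`, `L-`, `U-`) are what "explicitly marks boundary tokens" refers to: a single-token entity gets `U-` rather than a bare `B-`.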
@ -46,9 +46,9 @@ shortcut for this and instantiate the component using its string name and
|
||||||
| -------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| -------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `vocab` | `Vocab` | The shared vocabulary. |
|
| `vocab` | `Vocab` | The shared vocabulary. |
|
||||||
| `model` | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
|
| `model` | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
|
||||||
| `hidden_width` | int | Width of the hidden layer of the entity linking model, defaults to 128. |
|
| `hidden_width` | int | Width of the hidden layer of the entity linking model, defaults to `128`. |
|
||||||
| `incl_prior` | bool | Whether or not to include prior probabilities in the model. Defaults to True. |
|
| `incl_prior` | bool | Whether or not to include prior probabilities in the model. Defaults to `True`. |
|
||||||
| `incl_context` | bool | Whether or not to include the local context in the model (if not: only prior probabilites are used). Defaults to True. |
|
| `incl_context` | bool | Whether or not to include the local context in the model (if not: only prior probabilities are used). Defaults to `True`. |
|
||||||
| **RETURNS** | `EntityLinker` | The newly constructed object. |
|
| **RETURNS** | `EntityLinker` | The newly constructed object. |
|
||||||
|
|
||||||
## EntityLinker.\_\_call\_\_ {#call tag="method"}
|
## EntityLinker.\_\_call\_\_ {#call tag="method"}
|
||||||
|
|
|
@ -12,7 +12,7 @@ used on its own to implement a purely rule-based entity recognition system.
|
||||||
After initialization, the component is typically added to the processing
|
After initialization, the component is typically added to the processing
|
||||||
pipeline using [`nlp.add_pipe`](/api/language#add_pipe). For usage examples, see
|
pipeline using [`nlp.add_pipe`](/api/language#add_pipe). For usage examples, see
|
||||||
the docs on
|
the docs on
|
||||||
[rule-based entity recogntion](/usage/rule-based-matching#entityruler).
|
[rule-based entity recognition](/usage/rule-based-matching#entityruler).
|
||||||
|
|
||||||
## EntityRuler.\_\_init\_\_ {#init tag="method"}
|
## EntityRuler.\_\_init\_\_ {#init tag="method"}
|
||||||
|
|
||||||
|
|
|
@ -6,7 +6,7 @@ source: spacy/lookups.py
|
||||||
new: 2.2
|
new: 2.2
|
||||||
---
|
---
|
||||||
|
|
||||||
This class allows convenient accesss to large lookup tables and dictionaries,
|
This class allows convenient access to large lookup tables and dictionaries,
|
||||||
e.g. lemmatization data or tokenizer exception lists using Bloom filters.
|
e.g. lemmatization data or tokenizer exception lists using Bloom filters.
|
||||||
Lookups are available via the [`Vocab`](/api/vocab) as `vocab.lookups`, so they
|
Lookups are available via the [`Vocab`](/api/vocab) as `vocab.lookups`, so they
|
||||||
can be accessed before the pipeline components are applied (e.g. in the
|
can be accessed before the pipeline components are applied (e.g. in the
|
||||||
|
|
|
@ -391,7 +391,7 @@ To support the entity linking task, spaCy stores external knowledge in a
|
||||||
its data efficiently.
|
its data efficiently.
|
||||||
|
|
||||||
> - **Mention**: A textual occurrence of a named entity, e.g. 'Miss Lovelace'.
|
> - **Mention**: A textual occurrence of a named entity, e.g. 'Miss Lovelace'.
|
||||||
> - **KB ID**: A unique identifier refering to a particular real-world concept,
|
> - **KB ID**: A unique identifier referring to a particular real-world concept,
|
||||||
> e.g. 'Q7259'.
|
> e.g. 'Q7259'.
|
||||||
> - **Alias**: A plausible synonym or description for a certain KB ID, e.g. 'Ada
|
> - **Alias**: A plausible synonym or description for a certain KB ID, e.g. 'Ada
|
||||||
> Lovelace'.
|
> Lovelace'.
|
||||||
|
|
|
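To make the terminology in the hunk above concrete, here is a minimal toy sketch of how aliases, KB IDs and prior probabilities relate. It is an assumption-laden illustration, not spaCy's `KnowledgeBase` API: the class `ToyKB`, its methods, the second identifier `Q_OTHER` and the prior values are all hypothetical.

```python
from collections import defaultdict


class ToyKB:
    """Toy alias table: alias text -> [(kb_id, prior_probability), ...]."""

    def __init__(self):
        self.entities = set()
        self.aliases = defaultdict(list)

    def add_entity(self, kb_id):
        self.entities.add(kb_id)

    def add_alias(self, alias, kb_ids, priors):
        if len(kb_ids) != len(priors):
            raise ValueError("need exactly one prior per KB ID")
        if sum(priors) > 1.0 + 1e-9:
            raise ValueError("priors for one alias must not sum above 1")
        for kb_id, prior in zip(kb_ids, priors):
            if kb_id not in self.entities:
                raise ValueError("unknown KB ID: " + kb_id)
            self.aliases[alias].append((kb_id, prior))

    def get_candidates(self, mention):
        """Return candidate (kb_id, prior) pairs, most probable first."""
        pairs = self.aliases.get(mention, [])
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)


kb = ToyKB()
kb.add_entity("Q7259")    # KB ID taken from the docs text above
kb.add_entity("Q_OTHER")  # hypothetical second entity, for ambiguity
kb.add_alias("Ada Lovelace", ["Q7259"], [1.0])                # made-up prior
kb.add_alias("Lovelace", ["Q_OTHER", "Q7259"], [0.2, 0.8])    # made-up priors
print(kb.get_candidates("Lovelace"))
```

A mention such as "Lovelace" in running text is looked up as an alias, and the priors rank the candidate KB IDs before any context is taken into account.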
@ -625,7 +625,7 @@ https://github.com/explosion/spaCy/tree/master/examples/training/pretrain_kb.py
|
||||||
2. **Pretrain the entity embeddings** by running the descriptions of the
|
2. **Pretrain the entity embeddings** by running the descriptions of the
|
||||||
entities through a simple encoder-decoder network. The current implementation
|
entities through a simple encoder-decoder network. The current implementation
|
||||||
requires the `nlp` model to have access to pre-trained word embeddings, but a
|
requires the `nlp` model to have access to pre-trained word embeddings, but a
|
||||||
custom implementation of this enoding step can also be used.
|
custom implementation of this encoding step can also be used.
|
||||||
3. **Construct the KB** by defining all entities with their pretrained vectors,
|
3. **Construct the KB** by defining all entities with their pretrained vectors,
|
||||||
and all aliases with their prior probabilities.
|
and all aliases with their prior probabilities.
|
||||||
4. **Save** the KB using [`kb.dump`](/api/kb#dump).
|
4. **Save** the KB using [`kb.dump`](/api/kb#dump).
|
||||||
|
|