mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-25 17:36:30 +03:00
Fix typos and formatting [ci skip]
parent ca0b20ae8b
commit 932ad9cb91
@@ -501,7 +501,7 @@ entities:
 > than the **BILUO** scheme that we use, which explicitly marks boundary tokens.
 
 spaCy translates the character offsets into this scheme, in order to decide the
-cost of each action given the current state of the entity recogniser. The costs
+cost of each action given the current state of the entity recognizer. The costs
 are then used to calculate the gradient of the loss, to train the model. The
 exact algorithm is a pastiche of well-known methods, and is not currently
 described in any single publication. The model is a greedy transition-based

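Reviewer note on the hunk above: the passage says character offsets are translated into the BILUO scheme. The sketch below illustrates that idea in plain Python. It is not spaCy's implementation; the function name `char_offsets_to_biluo`, the whitespace tokenizer, and the example sentence are assumptions made for illustration only.

```python
def char_offsets_to_biluo(text, entities):
    """Map (start_char, end_char, label) entities to one BILUO tag per token.

    Assumption: tokens are whitespace-separated. A real tokenizer is more
    involved, and entities that do not align with token boundaries are
    simply left untagged here.
    """
    # Collect (start, end) character offsets for each whitespace token.
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append((start, end))
        pos = end

    tags = ["O"] * len(tokens)
    for ent_start, ent_end, label in entities:
        # Indices of tokens that lie fully inside the entity span.
        covered = [
            i for i, (s, e) in enumerate(tokens)
            if s >= ent_start and e <= ent_end
        ]
        if not covered:
            continue
        if len(covered) == 1:
            tags[covered[0]] = "U-" + label      # Unit: single-token entity
        else:
            tags[covered[0]] = "B-" + label      # Begin
            for i in covered[1:-1]:
                tags[i] = "I-" + label           # In
            tags[covered[-1]] = "L-" + label     # Last
    return tags


text = "Ada Lovelace was born in London"
tags = char_offsets_to_biluo(text, [(0, 12, "PERSON"), (25, 31, "GPE")])
print(tags)
```

Marking the last token explicitly (`L-`) and single-token entities (`U-`) is what distinguishes BILUO from a plain begin/inside/outside scheme.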
@@ -46,9 +46,9 @@ shortcut for this and instantiate the component using its string name and
 | -------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `vocab` | `Vocab` | The shared vocabulary. |
 | `model` | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
-| `hidden_width` | int | Width of the hidden layer of the entity linking model, defaults to 128. |
-| `incl_prior` | bool | Whether or not to include prior probabilities in the model. Defaults to True. |
-| `incl_context` | bool | Whether or not to include the local context in the model (if not: only prior probabilites are used). Defaults to True. |
+| `hidden_width` | int | Width of the hidden layer of the entity linking model, defaults to `128`. |
+| `incl_prior` | bool | Whether or not to include prior probabilities in the model. Defaults to `True`. |
+| `incl_context` | bool | Whether or not to include the local context in the model (if not: only prior probabilities are used). Defaults to `True`. |
 | **RETURNS** | `EntityLinker` | The newly constructed object. |
 
 ## EntityLinker.\_\_call\_\_ {#call tag="method"}

@@ -12,7 +12,7 @@ used on its own to implement a purely rule-based entity recognition system.
 After initialization, the component is typically added to the processing
 pipeline using [`nlp.add_pipe`](/api/language#add_pipe). For usage examples, see
 the docs on
-[rule-based entity recogntion](/usage/rule-based-matching#entityruler).
+[rule-based entity recognition](/usage/rule-based-matching#entityruler).
 
 ## EntityRuler.\_\_init\_\_ {#init tag="method"}
 

@@ -6,7 +6,7 @@ source: spacy/lookups.py
 new: 2.2
 ---
 
-This class allows convenient accesss to large lookup tables and dictionaries,
+This class allows convenient access to large lookup tables and dictionaries,
 e.g. lemmatization data or tokenizer exception lists using Bloom filters.
 Lookups are available via the [`Vocab`](/api/vocab) as `vocab.lookups`, so they
 can be accessed before the pipeline components are applied (e.g. in the

@@ -391,7 +391,7 @@ To support the entity linking task, spaCy stores external knowledge in a
 its data efficiently.
 
 > - **Mention**: A textual occurrence of a named entity, e.g. 'Miss Lovelace'.
-> - **KB ID**: A unique identifier refering to a particular real-world concept,
+> - **KB ID**: A unique identifier referring to a particular real-world concept,
 > e.g. 'Q7259'.
 > - **Alias**: A plausible synonym or description for a certain KB ID, e.g. 'Ada
 > Lovelace'.

@@ -625,7 +625,7 @@ https://github.com/explosion/spaCy/tree/master/examples/training/pretrain_kb.py
 2. **Pretrain the entity embeddings** by running the descriptions of the
 entities through a simple encoder-decoder network. The current implementation
 requires the `nlp` model to have access to pre-trained word embeddings, but a
-custom implementation of this enoding step can also be used.
+custom implementation of this encoding step can also be used.
 3. **Construct the KB** by defining all entities with their pretrained vectors,
 and all aliases with their prior probabilities.
 4. **Save** the KB using [`kb.dump`](/api/kb#dump).

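Reviewer note on the hunk above: step 3 describes a KB made of entities with vectors and aliases with prior probabilities. The toy structure below sketches that data model in plain Python. It is an illustration only and does not use or reproduce spaCy's `KnowledgeBase` API; the class `ToyKB`, its method names, the vectors, the probabilities, and the ID `Q_HYPOTHETICAL` are all made up for the example.

```python
class ToyKB:
    """Hypothetical in-memory stand-in for a knowledge base (not spaCy's API)."""

    def __init__(self):
        self.entities = {}  # KB ID -> entity vector
        self.aliases = {}   # alias text -> [(KB ID, prior probability), ...]

    def add_entity(self, kb_id, vector):
        self.entities[kb_id] = list(vector)

    def add_alias(self, alias, kb_ids, priors):
        if len(kb_ids) != len(priors):
            raise ValueError("need exactly one prior per KB ID")
        if sum(priors) > 1.0 + 1e-9:
            raise ValueError("priors for one alias must not sum to more than 1")
        for kb_id in kb_ids:
            if kb_id not in self.entities:
                raise KeyError("unknown KB ID: " + kb_id)
        # Store candidates sorted by descending prior probability.
        self.aliases[alias] = sorted(
            zip(kb_ids, priors), key=lambda pair: pair[1], reverse=True
        )

    def candidates(self, mention):
        """Return (KB ID, prior) pairs for a mention, most likely first."""
        return self.aliases.get(mention, [])


kb = ToyKB()
kb.add_entity("Q7259", [0.1, 0.2, 0.3])           # vector values are made up
kb.add_entity("Q_HYPOTHETICAL", [0.4, 0.5, 0.6])  # placeholder ID, not a real entry
kb.add_alias("Lovelace", ["Q_HYPOTHETICAL", "Q7259"], [0.1, 0.9])
print(kb.candidates("Lovelace"))
```

With priors alone, a linker would pick the first candidate; per the `incl_context` option in the table earlier in this commit, the local context can also be taken into account.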