Update docs [ci skip]

This commit is contained in:
Ines Montani 2020-08-07 16:13:13 +02:00
parent 6f3649923c
commit b7e34c1451
3 changed files with 16 additions and 18 deletions

View File

@ -27,12 +27,12 @@ lemmatizers, see the
> nlp.add_pipe("lemmatizer", config=config)
> ```
| Setting | Type | Description | Default |
| ----------- | ------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------- |
| `mode` | str | The lemmatizer mode, e.g. "lookup" or "rule". | `"lookup"` |
| `lookups` | [`Lookups`](/api/lookups) | The lookups object containing the tables such as "lemma_rules", "lemma_index", "lemma_exc" and "lemma_lookup". If `None`, default tables are loaded from `spacy-lookups-data`. | `None` |
| `overwrite` | bool | Whether to overwrite existing lemmas. | `False` |
| `model` | [`Model`](https://thinc.ai/docs/api-model) | **Not yet implemented:** the model to use. | `None` |
| Setting | Type | Description | Default |
| ----------- | ------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |
| `mode` | str | The lemmatizer mode, e.g. "lookup" or "rule". | `"lookup"` |
| `lookups` | [`Lookups`](/api/lookups) | The lookups object containing the tables such as `"lemma_rules"`, `"lemma_index"`, `"lemma_exc"` and `"lemma_lookup"`. If `None`, default tables are loaded from `spacy-lookups-data`. | `None` |
| `overwrite` | bool | Whether to overwrite existing lemmas. | `False` |
| `model` | [`Model`](https://thinc.ai/docs/api-model) | **Not yet implemented:** the model to use. | `None` |
```python
https://github.com/explosion/spaCy/blob/develop/spacy/pipeline/lemmatizer.py

View File

@ -12,14 +12,15 @@ passed on to the next component.
> - **Creates:** Objects, attributes and properties modified and set by the
> component.
| Name | Component | Creates | Description |
| ------------- | ------------------------------------------------------------------ | ----------------------------------------------------------- | ------------------------------------------------ |
| **tokenizer** | [`Tokenizer`](/api/tokenizer) | `Doc` | Segment text into tokens. |
| **tagger** | [`Tagger`](/api/tagger) | `Doc[i].tag` | Assign part-of-speech tags. |
| **parser** | [`DependencyParser`](/api/dependencyparser) | `Doc[i].head`, `Doc[i].dep`, `Doc.sents`, `Doc.noun_chunks` | Assign dependency labels. |
| **ner** | [`EntityRecognizer`](/api/entityrecognizer) | `Doc.ents`, `Doc[i].ent_iob`, `Doc[i].ent_type` | Detect and label named entities. |
| **textcat** | [`TextCategorizer`](/api/textcategorizer) | `Doc.cats` | Assign document labels. |
| ... | [custom components](/usage/processing-pipelines#custom-components) | `Doc._.xxx`, `Token._.xxx`, `Span._.xxx` | Assign custom attributes, methods or properties. |
| Name | Component | Creates | Description |
| -------------- | ------------------------------------------------------------------ | --------------------------------------------------------- | ------------------------------------------------ |
| **tokenizer** | [`Tokenizer`](/api/tokenizer) | `Doc` | Segment text into tokens. |
| **tagger** | [`Tagger`](/api/tagger) | `Token.tag` | Assign part-of-speech tags. |
| **parser** | [`DependencyParser`](/api/dependencyparser) | `Token.head`, `Token.dep`, `Doc.sents`, `Doc.noun_chunks` | Assign dependency labels. |
| **ner** | [`EntityRecognizer`](/api/entityrecognizer) | `Doc.ents`, `Token.ent_iob`, `Token.ent_type` | Detect and label named entities. |
| **lemmatizer** | [`Lemmatizer`](/api/lemmatizer) | `Token.lemma` | Assign base forms. |
| **textcat** | [`TextCategorizer`](/api/textcategorizer) | `Doc.cats` | Assign document labels. |
| ... | [custom components](/usage/processing-pipelines#custom-components) | `Doc._.xxx`, `Token._.xxx`, `Span._.xxx` | Assign custom attributes, methods or properties. |
The processing pipeline always **depends on the statistical model** and its
capabilities. For example, a pipeline can only include an entity recognizer

View File

@ -228,16 +228,13 @@ available pipeline components and component functions.
| `entity_linker` | [`EntityLinker`](/api/entitylinker) | Assign knowledge base IDs to named entities. Should be added after the entity recognizer. |
| `entity_ruler` | [`EntityRuler`](/api/entityruler) | Assign named entities based on pattern rules and dictionaries. |
| `textcat` | [`TextCategorizer`](/api/textcategorizer) | Assign text categories. |
| `lemmatizer` | [`Lemmatizer`](/api/lemmatizer) | Assign base forms to words. |
| `morphologizer` | [`Morphologizer`](/api/morphologizer) | Assign morphological features and coarse-grained POS tags. |
| `senter` | [`SentenceRecognizer`](/api/sentencerecognizer) | Assign sentence boundaries. |
| `sentencizer` | [`Sentencizer`](/api/sentencizer) | Add rule-based sentence segmentation without the dependency parse. |
| `tok2vec` | [`Tok2Vec`](/api/tok2vec) | |
| `transformer` | [`Transformer`](/api/transformer) | Assign the tokens and outputs of a transformer model. |
<!-- TODO: finish and update with more components -->
<!-- TODO: explain default config and factories -->
### Disabling and modifying pipeline components {#disabling}
If you don't need a particular component of the pipeline for example, the