Update docs [ci skip]

2025-07-31 02:19:46 +03:00 · 2020-10-09 10:36:06 +02:00 · 329b61ee7b
commit 329b61ee7b
parent 67652bcbb5
5 changed files with 27 additions and 30 deletions
--- a/website/docs/api/attributeruler.md
+++ b/website/docs/api/attributeruler.md
@@ -4,7 +4,6 @@ tag: class
 source: spacy/pipeline/attributeruler.py
 new: 3
 teaser: 'Pipeline component for rule-based token attribute assignment'
-api_base_class: /api/pipe
 api_string_name: attribute_ruler
 api_trainable: false
 ---
--- a/website/docs/api/pipe.md
+++ b/website/docs/api/pipe.md
@@ -14,7 +14,7 @@ for how to use the `TrainablePipe` base class to implement custom components.

 <!-- TODO: Pipe vs TrainablePipe, check methods below (all renamed to TrainablePipe for now) -->

-> #### Why is TrainablePipe implemented in Cython?
+> #### Why is it implemented in Cython?
 >
 > The `TrainablePipe` class is implemented in a `.pyx` module, the extension
 > used by [Cython](/api/cython). This is needed so that **other** Cython
--- a/website/docs/api/sentencizer.md
+++ b/website/docs/api/sentencizer.md
@@ -3,7 +3,6 @@ title: Sentencizer
 tag: class
 source: spacy/pipeline/sentencizer.pyx
 teaser: 'Pipeline component for rule-based sentence boundary detection'
-api_base_class: /api/pipe
 api_string_name: sentencizer
 api_trainable: false
 ---
@@ -130,9 +129,9 @@ Score a batch of examples.

 ## Sentencizer.to_disk {#to_disk tag="method"}

-Save the sentencizer settings (punctuation characters) to a directory. Will create
-a file `sentencizer.json`. This also happens automatically when you save an
-`nlp` object with a sentencizer added to its pipeline.
+Save the sentencizer settings (punctuation characters) to a directory. Will
+create a file `sentencizer.json`. This also happens automatically when you save
+an `nlp` object with a sentencizer added to its pipeline.

 > #### Example
 >
--- a/website/docs/usage/101/_architecture.md
+++ b/website/docs/usage/101/_architecture.md
@@ -20,13 +20,13 @@ It also orchestrates training and serialization.

 | Name                        | Description                                                                                                                                             |
 | --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [`Language`](/api/language) | Processing class that turns text into `Doc` objects. Different languages implement their own subclasses of it. The variable is typically called `nlp`.  |
 | [`Doc`](/api/doc)           | A container for accessing linguistic annotations.                                                                                                       |
+| [`DocBin`](/api/docbin)     | A collection of `Doc` objects for efficient binary serialization. Also used for [training data](/api/data-formats#binary-training).                     |
+| [`Example`](/api/example)   | A collection of training annotations, containing two `Doc` objects: the reference data and the predictions.                                             |
+| [`Language`](/api/language) | Processing class that turns text into `Doc` objects. Different languages implement their own subclasses of it. The variable is typically called `nlp`.  |
+| [`Lexeme`](/api/lexeme)     | An entry in the vocabulary. It's a word type with no context, as opposed to a word token. It therefore has no part-of-speech tag, dependency parse etc. |
 | [`Span`](/api/span)         | A slice from a `Doc` object.                                                                                                                            |
 | [`Token`](/api/token)       | An individual token — i.e. a word, punctuation symbol, whitespace, etc.                                                                                 |
-| [`Lexeme`](/api/lexeme)     | An entry in the vocabulary. It's a word type with no context, as opposed to a word token. It therefore has no part-of-speech tag, dependency parse etc. |
-| [`Example`](/api/example)   | A collection of training annotations, containing two `Doc` objects: the reference data and the predictions.                                             |
-| [`DocBin`](/api/docbin)     | A collection of `Doc` objects for efficient binary serialization. Also used for [training data](/api/data-formats#binary-training).                     |

 ### Processing pipeline {#architecture-pipeline}

@@ -42,23 +42,22 @@ components for different language processing tasks and also allows adding

 | Name                                            | Description                                                                                 |
 | ----------------------------------------------- | ------------------------------------------------------------------------------------------- |
-| [`Tokenizer`](/api/tokenizer)                   | Segment raw text and create `Doc` objects from the words.                                   |
-| [`Tok2Vec`](/api/tok2vec)                       | Apply a "token-to-vector" model and set its outputs.                                        |
-| [`Transformer`](/api/transformer)               | Use a transformer model and set its outputs.                                                |
-| [`Lemmatizer`](/api/lemmatizer)                 | Determine the base forms of words.                                                          |
-| [`Morphologizer`](/api/morphologizer)           | Predict morphological features and coarse-grained part-of-speech tags.                      |
-| [`Tagger`](/api/tagger)                         | Predict part-of-speech tags.                                                                |
 | [`AttributeRuler`](/api/attributeruler)         | Set token attributes using matcher rules.                                                   |
 | [`DependencyParser`](/api/dependencyparser)     | Predict syntactic dependencies.                                                             |
+| [`EntityLinker`](/api/entitylinker)             | Disambiguate named entities to nodes in a knowledge base.                                   |
 | [`EntityRecognizer`](/api/entityrecognizer)     | Predict named entities, e.g. persons or products.                                           |
 | [`EntityRuler`](/api/entityruler)               | Add entity spans to the `Doc` using token-based rules or exact phrase matches.              |
-| [`EntityLinker`](/api/entitylinker)             | Disambiguate named entities to nodes in a knowledge base.                                   |
-| [`TextCategorizer`](/api/textcategorizer)       | Predict categories or labels over the whole document.                                       |
-| [`Sentencizer`](/api/sentencizer)               | Implement rule-based sentence boundary detection that doesn't require the dependency parse. |
+| [`Lemmatizer`](/api/lemmatizer)                 | Determine the base forms of words.                                                          |
+| [`Morphologizer`](/api/morphologizer)           | Predict morphological features and coarse-grained part-of-speech tags.                      |
 | [`SentenceRecognizer`](/api/sentencerecognizer) | Predict sentence boundaries.                                                                |
-| [Other functions](/api/pipeline-functions)      | Automatically apply something to the `Doc`, e.g. to merge spans of tokens.                  |
-| [`Pipe`](/api/pipe)                             | Base class that pipeline components may inherit from.                                       |
+| [`Sentencizer`](/api/sentencizer)               | Implement rule-based sentence boundary detection that doesn't require the dependency parse. |
+| [`Tagger`](/api/tagger)                         | Predict part-of-speech tags.                                                                |
+| [`TextCategorizer`](/api/textcategorizer)       | Predict categories or labels over the whole document.                                       |
+| [`Tok2Vec`](/api/tok2vec)                       | Apply a "token-to-vector" model and set its outputs.                                        |
+| [`Tokenizer`](/api/tokenizer)                   | Segment raw text and create `Doc` objects from the words.                                   |
 | [`TrainablePipe`](/api/pipe)                    | Class that all trainable pipeline components inherit from.                                  |
+| [`Transformer`](/api/transformer)               | Use a transformer model and set its outputs.                                                |
+| [Other functions](/api/pipeline-functions)      | Automatically apply something to the `Doc`, e.g. to merge spans of tokens.                  |

 ### Matchers {#architecture-matchers}

@@ -68,20 +67,20 @@ operates on a `Doc` and gives you access to the matched tokens **in context**.

 | Name                                          | Description                                                                                                                                                                        |
 | --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| [`DependencyMatcher`](/api/dependencymatcher) | Match sequences of tokens based on dependency trees using [Semgrex operators](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html). |
 | [`Matcher`](/api/matcher)                     | Match sequences of tokens, based on pattern rules, similar to regular expressions.                                                                                                 |
 | [`PhraseMatcher`](/api/phrasematcher)         | Match sequences of tokens based on phrases.                                                                                                                                        |
-| [`DependencyMatcher`](/api/dependencymatcher) | Match sequences of tokens based on dependency trees using [Semgrex operators](https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html). |

 ### Other classes {#architecture-other}

 | Name                                             | Description                                                                                        |
 | ------------------------------------------------ | -------------------------------------------------------------------------------------------------- |
-| [`Vocab`](/api/vocab)                            | The shared vocabulary that stores strings and gives you access to [`Lexeme`](/api/lexeme) objects. |
+| [`Corpus`](/api/corpus)                          | Class for managing annotated corpora for training and evaluation data.                             |
+| [`KnowledgeBase`](/api/kb)                       | Storage for entities and aliases of a knowledge base for entity linking.                           |
+| [`Lookups`](/api/lookups)                        | Container for convenient access to large lookup tables and dictionaries.                           |
+| [`MorphAnalysis`](/api/morphology#morphanalysis) | A morphological analysis.                                                                          |
+| [`Morphology`](/api/morphology)                  | Store morphological analyses and map them to and from hash values.                                 |
+| [`Scorer`](/api/scorer)                          | Compute evaluation scores.                                                                         |
 | [`StringStore`](/api/stringstore)                | Map strings to and from hash values.                                                               |
 | [`Vectors`](/api/vectors)                        | Container class for vector data keyed by string.                                                   |
-| [`Lookups`](/api/lookups)                        | Container for convenient access to large lookup tables and dictionaries.                           |
-| [`Morphology`](/api/morphology)                  | Store morphological analyses and map them to and from hash values.                                 |
-| [`MorphAnalysis`](/api/morphology#morphanalysis) | A morphological analysis.                                                                          |
-| [`KnowledgeBase`](/api/kb)                       | Storage for entities and aliases of a knowledge base for entity linking.                           |
-| [`Scorer`](/api/scorer)                          | Compute evaluation scores.                                                                         |
-| [`Corpus`](/api/corpus)                          | Class for managing annotated corpora for training and evaluation data.                             |
+| [`Vocab`](/api/vocab)                            | The shared vocabulary that stores strings and gives you access to [`Lexeme`](/api/lexeme) objects. |
--- a/website/meta/sidebars.json
+++ b/website/meta/sidebars.json
@@ -94,13 +94,13 @@
                    { "text": "EntityRuler", "url": "/api/entityruler" },
                    { "text": "Lemmatizer", "url": "/api/lemmatizer" },
                    { "text": "Morphologizer", "url": "/api/morphologizer" },
-                    { "text": "Pipe", "url": "/api/pipe" },
                    { "text": "SentenceRecognizer", "url": "/api/sentencerecognizer" },
                    { "text": "Sentencizer", "url": "/api/sentencizer" },
                    { "text": "Tagger", "url": "/api/tagger" },
                    { "text": "TextCategorizer", "url": "/api/textcategorizer" },
                    { "text": "Tok2Vec", "url": "/api/tok2vec" },
                    { "text": "Tokenizer", "url": "/api/tokenizer" },
+                    { "text": "TrainablePipe", "url": "/api/pipe" },
                    { "text": "Transformer", "url": "/api/transformer" },
                    { "text": "Other Functions", "url": "/api/pipeline-functions" }
                ]