Improve built-in component API docs

2025-08-02 19:30:19 +03:00 · 2019-02-24 13:11:49 +01:00 · c03cb1cc63
commit c03cb1cc63
parent 235a0e948e
4 changed files with 51 additions and 40 deletions
--- a/website/docs/api/dependencyparser.md
+++ b/website/docs/api/dependencyparser.md
@@ -37,17 +37,19 @@ shortcut for this and instantiate the component using its string name and
 > parser.from_disk("/path/to/model")
 > ```

-| Name        | Type                           | Description                                                                                                                                           |
-| ----------- | ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `vocab`     | `Vocab`                        | The shared vocabulary.                                                                                                                                |
-| `model`     | `thinc.neural.Model` or `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
-| `**cfg`     | -                              | Configuration parameters.                                                                                                                             |
-| **RETURNS** | `DependencyParser`             | The newly constructed object.                                                                                                                         |
+| Name        | Type                          | Description                                                                                                                                           |
+| ----------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `vocab`     | `Vocab`                       | The shared vocabulary.                                                                                                                                |
+| `model`     | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
+| `**cfg`     | -                             | Configuration parameters.                                                                                                                             |
+| **RETURNS** | `DependencyParser`            | The newly constructed object.                                                                                                                         |

 ## DependencyParser.\_\_call\_\_ {#call tag="method"}

 Apply the pipe to one document. The document is modified in place, and returned.
-Both [`__call__`](/api/dependencyparser#call) and
+This usually happens under the hood when you call the `nlp` object on a text and
+all pipeline components are applied to the `Doc` in order. Both
+[`__call__`](/api/dependencyparser#call) and
 [`pipe`](/api/dependencyparser#pipe) delegate to the
 [`predict`](/api/dependencyparser#predict) and
 [`set_annotations`](/api/dependencyparser#set_annotations) methods.
@@ -57,6 +59,7 @@ Both [`__call__`](/api/dependencyparser#call) and
 > ```python
 > parser = DependencyParser(nlp.vocab)
 > doc = nlp(u"This is a sentence.")
+> # This usually happens under the hood
 > processed = parser(doc)
 > ```

@@ -82,11 +85,11 @@ Apply the pipe to a stream of documents. Both
 >     pass
 > ```

-| Name         | Type     | Description                                                                                                    |
-| ------------ | -------- | -------------------------------------------------------------------------------------------------------------- |
-| `stream`     | iterable | A stream of documents.                                                                                         |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.                                                              |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text.                                                         |
+| Name         | Type     | Description                                            |
+| ------------ | -------- | ------------------------------------------------------ |
+| `stream`     | iterable | A stream of documents.                                 |
+| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |

 ## DependencyParser.predict {#predict tag="method"}

--- a/website/docs/api/entityrecognizer.md
+++ b/website/docs/api/entityrecognizer.md
@@ -37,17 +37,19 @@ shortcut for this and instantiate the component using its string name and
 > ner.from_disk("/path/to/model")
 > ```

-| Name        | Type                           | Description                                                                                                                                           |
-| ----------- | ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `vocab`     | `Vocab`                        | The shared vocabulary.                                                                                                                                |
-| `model`     | `thinc.neural.Model` or `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
-| `**cfg`     | -                              | Configuration parameters.                                                                                                                             |
-| **RETURNS** | `EntityRecognizer`             | The newly constructed object.                                                                                                                         |
+| Name        | Type                          | Description                                                                                                                                           |
+| ----------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `vocab`     | `Vocab`                       | The shared vocabulary.                                                                                                                                |
+| `model`     | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
+| `**cfg`     | -                             | Configuration parameters.                                                                                                                             |
+| **RETURNS** | `EntityRecognizer`            | The newly constructed object.                                                                                                                         |

 ## EntityRecognizer.\_\_call\_\_ {#call tag="method"}

 Apply the pipe to one document. The document is modified in place, and returned.
-Both [`__call__`](/api/entityrecognizer#call) and
+This usually happens under the hood when you call the `nlp` object on a text and
+all pipeline components are applied to the `Doc` in order. Both
+[`__call__`](/api/entityrecognizer#call) and
 [`pipe`](/api/entityrecognizer#pipe) delegate to the
 [`predict`](/api/entityrecognizer#predict) and
 [`set_annotations`](/api/entityrecognizer#set_annotations) methods.
@@ -57,6 +59,7 @@ Both [`__call__`](/api/entityrecognizer#call) and
 > ```python
 > ner = EntityRecognizer(nlp.vocab)
 > doc = nlp(u"This is a sentence.")
+> # This usually happens under the hood
 > processed = ner(doc)
 > ```

@@ -82,11 +85,11 @@ Apply the pipe to a stream of documents. Both
 >     pass
 > ```

-| Name         | Type     | Description                                                                                                    |
-| ------------ | -------- | -------------------------------------------------------------------------------------------------------------- |
-| `stream`     | iterable | A stream of documents.                                                                                         |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.                                                              |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text.                                                         |
+| Name         | Type     | Description                                            |
+| ------------ | -------- | ------------------------------------------------------ |
+| `stream`     | iterable | A stream of documents.                                 |
+| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |

 ## EntityRecognizer.predict {#predict tag="method"}

--- a/website/docs/api/tagger.md
+++ b/website/docs/api/tagger.md
@@ -37,18 +37,20 @@ shortcut for this and instantiate the component using its string name and
 > tagger.from_disk("/path/to/model")
 > ```

-| Name        | Type                           | Description                                                                                                                                           |
-| ----------- | ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `vocab`     | `Vocab`                        | The shared vocabulary.                                                                                                                                |
-| `model`     | `thinc.neural.Model` or `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
-| `**cfg`     | -                              | Configuration parameters.                                                                                                                             |
-| **RETURNS** | `Tagger`                       | The newly constructed object.                                                                                                                         |
+| Name        | Type                          | Description                                                                                                                                           |
+| ----------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `vocab`     | `Vocab`                       | The shared vocabulary.                                                                                                                                |
+| `model`     | `thinc.neural.Model` / `True` | The model powering the pipeline component. If no model is supplied, the model is created when you call `begin_training`, `from_disk` or `from_bytes`. |
+| `**cfg`     | -                             | Configuration parameters.                                                                                                                             |
+| **RETURNS** | `Tagger`                      | The newly constructed object.                                                                                                                         |

 ## Tagger.\_\_call\_\_ {#call tag="method"}

 Apply the pipe to one document. The document is modified in place, and returned.
-Both [`__call__`](/api/tagger#call) and [`pipe`](/api/tagger#pipe) delegate to
-the [`predict`](/api/tagger#predict) and
+This usually happens under the hood when you call the `nlp` object on a text and
+all pipeline components are applied to the `Doc` in order. Both
+[`__call__`](/api/tagger#call) and [`pipe`](/api/tagger#pipe) delegate to the
+[`predict`](/api/tagger#predict) and
 [`set_annotations`](/api/tagger#set_annotations) methods.

 > #### Example
@@ -56,6 +58,7 @@ the [`predict`](/api/tagger#predict) and
 > ```python
 > tagger = Tagger(nlp.vocab)
 > doc = nlp(u"This is a sentence.")
+> # This usually happens under the hood
 > processed = tagger(doc)
 > ```

@@ -79,11 +82,11 @@ Apply the pipe to a stream of documents. Both [`__call__`](/api/tagger#call) and
 >     pass
 > ```

-| Name         | Type     | Description                                                                                                    |
-| ------------ | -------- | -------------------------------------------------------------------------------------------------------------- |
-| `stream`     | iterable | A stream of documents.                                                                                         |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.                                                              |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text.                                                         |
+| Name         | Type     | Description                                            |
+| ------------ | -------- | ------------------------------------------------------ |
+| `stream`     | iterable | A stream of documents.                                 |
+| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |

 ## Tagger.predict {#predict tag="method"}

--- a/website/docs/api/textcategorizer.md
+++ b/website/docs/api/textcategorizer.md
@@ -48,9 +48,10 @@ shortcut for this and instantiate the component using its string name and
 ## TextCategorizer.\_\_call\_\_ {#call tag="method"}

 Apply the pipe to one document. The document is modified in place, and returned.
-Both [`__call__`](/api/textcategorizer#call) and
-[`pipe`](/api/textcategorizer#pipe) delegate to the
-[`predict`](/api/textcategorizer#predict) and
+This usually happens under the hood when you call the `nlp` object on a text and
+all pipeline components are applied to the `Doc` in order. Both
+[`__call__`](/api/textcategorizer#call) and [`pipe`](/api/textcategorizer#pipe)
+delegate to the [`predict`](/api/textcategorizer#predict) and
 [`set_annotations`](/api/textcategorizer#set_annotations) methods.

 > #### Example
@@ -58,6 +59,7 @@ Both [`__call__`](/api/textcategorizer#call) and
 > ```python
 > textcat = TextCategorizer(nlp.vocab)
 > doc = nlp(u"This is a sentence.")
+> # This usually happens under the hood
 > processed = textcat(doc)
 > ```
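
Note on the documented delegation: the docs touched by this commit say that `__call__` and `pipe` both delegate to `predict` and `set_annotations`, and that `pipe` buffers `batch_size` documents and yields them in their original order. The following is a minimal sketch of that layout only. `ToyDoc` and `ToyComponent` are hypothetical stand-ins invented for illustration, the word-length "scores" are placeholders, and none of this is spaCy's actual implementation.

```python
from itertools import islice


class ToyDoc:
    """Hypothetical stand-in for a Doc: holds words and an annotation slot."""

    def __init__(self, words):
        self.words = list(words)
        self.tags = None


class ToyComponent:
    """Hypothetical component mirroring the documented method layout."""

    def predict(self, docs):
        # Compute scores without modifying the docs (placeholder: word lengths).
        return [[len(word) for word in doc.words] for doc in docs]

    def set_annotations(self, docs, scores):
        # Write the precomputed scores onto the docs in place.
        for doc, doc_scores in zip(docs, scores):
            doc.tags = doc_scores

    def __call__(self, doc):
        # One document: predict, annotate in place, return the same object.
        self.set_annotations([doc], self.predict([doc]))
        return doc

    def pipe(self, stream, batch_size=128):
        # Many documents: buffer up to batch_size docs, annotate, yield in order.
        stream = iter(stream)
        while True:
            batch = list(islice(stream, batch_size))
            if not batch:
                break
            self.set_annotations(batch, self.predict(batch))
            yield from batch


component = ToyComponent()
doc = ToyDoc(["This", "is", "a", "sentence", "."])
processed = component(doc)  # same object as doc, now annotated

docs = [ToyDoc(["doc", str(i)]) for i in range(5)]
piped = list(component.pipe(docs, batch_size=2))
```

Keeping `predict` free of side effects and confining mutation to `set_annotations` is what lets the single-document and streaming paths share the same two methods in this sketch.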