mirror of https://github.com/explosion/spaCy.git
synced 2025-01-26 09:14:32 +03:00

commit 90b100c39f (parent 2298e129e6)

remove component.Model, update constructor, losses is return value of update
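The headline change in this diff: a component's `update` now returns the `losses` dictionary instead of only mutating one passed in by the caller. A minimal before/after sketch, assuming a component such as `parser`, a list of `examples` and an `optimizer` already exist (the names are illustrative):

```python
# Before this commit: the caller allocated the dict, which update()
# mutated in place.
losses = {}
parser.update(examples, losses=losses, sgd=optimizer)

# After this commit: update() hands the updated dict back, and passing
# `losses` in is optional.
losses = parser.update(examples, sgd=optimizer)
print(losses)  # one float per component, keyed by the model's name
```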
diff --git a/website/docs/api/dependencyparser.md b/website/docs/api/dependencyparser.md
@@ -8,35 +8,28 @@ This class is a subclass of `Pipe` and follows the same API. The pipeline
 component is available in the [processing pipeline](/usage/processing-pipelines)
 via the ID `"parser"`.
 
-## DependencyParser.Model {#model tag="classmethod"}
-
-Initialize a model for the pipe. The model should implement the
-`thinc.neural.Model` API. Wrappers are under development for most major machine
-learning libraries.
-
-| Name        | Type   | Description                           |
-| ----------- | ------ | ------------------------------------- |
-| `**kwargs`  | -      | Parameters for initializing the model |
-| **RETURNS** | object | The initialized model.                |
-
 ## DependencyParser.\_\_init\_\_ {#init tag="method"}
 
-Create a new pipeline instance. In your application, you would normally use a
-shortcut for this and instantiate the component using its string name and
-[`nlp.create_pipe`](/api/language#create_pipe).
-
 > #### Example
 >
 > ```python
-> # Construction via create_pipe
+> # Construction via create_pipe with default model
 > parser = nlp.create_pipe("parser")
 >
-> # Construction from class
+> # Construction via create_pipe with custom model
+> config = {"model": {"@architectures": "my_parser"}}
+> parser = nlp.create_pipe("parser", config)
+>
+> # Construction from class with custom model from file
 > from spacy.pipeline import DependencyParser
-> parser = DependencyParser(nlp.vocab, parser_model)
-> parser.from_disk("/path/to/model")
+> model = util.load_config("model.cfg", create_objects=True)["model"]
+> parser = DependencyParser(nlp.vocab, model)
 > ```
 
+Create a new pipeline instance. In your application, you would normally use a
+shortcut for this and instantiate the component using its string name and
+[`nlp.create_pipe`](/api/language#create_pipe).
+
 | Name        | Type               | Description            |
 | ----------- | ------------------ | ---------------------- |
 | `vocab`     | `Vocab`            | The shared vocabulary. |
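The new construction examples assume an architecture registered under the name `"my_parser"`, so that `{"@architectures": "my_parser"}` resolves when the config is loaded. A hypothetical sketch of such a registration; the exact import path and decorator spelling varied on the develop branch, so treat both as assumptions:

```python
from thinc.api import Model
from spacy.util import registry  # assumed location of the catalogue-based registry

@registry.architectures.register("my_parser")
def my_parser(width: int = 96) -> Model:
    # Build and return any Thinc v8 model here; the component wires it up
    # when the pipe is created from the config.
    ...
```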
@@ -85,11 +78,11 @@ applied to the `Doc` in order. Both [`__call__`](/api/dependencyparser#call) and
 > pass
 > ```
 
-| Name         | Type     | Description                                            |
-| ------------ | -------- | ------------------------------------------------------ |
-| `stream`     | iterable | A stream of documents.                                 |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |
+| Name         | Type            | Description                                            |
+| ------------ | --------------- | ------------------------------------------------------ |
+| `stream`     | `Iterable[Doc]` | A stream of documents.                                 |
+| `batch_size` | int             | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`           | Processed documents in the order of the original text. |
 
 ## DependencyParser.predict {#predict tag="method"}
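The type change above makes explicit that a component's `pipe` consumes `Doc` objects, not raw strings. A small sketch, assuming an existing `nlp` pipeline whose `parser` variable is the component instance:

```python
# Create Docs first (e.g. with make_doc), then stream them through the pipe.
texts = ["First text.", "Second text."]
docs = (nlp.make_doc(text) for text in texts)
for doc in parser.pipe(docs, batch_size=128):
    print([(token.text, token.dep_) for token in doc])
```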
@@ -104,7 +97,7 @@ Apply the pipeline's model to a batch of docs, without modifying them.
 
 | Name        | Type                | Description                                    |
 | ----------- | ------------------- | ---------------------------------------------- |
-| `docs`      | iterable            | The documents to predict.                      |
+| `docs`      | `Iterable[Doc]`     | The documents to predict.                      |
 | **RETURNS** | `syntax.StateClass` | A helper class for the parse state (internal). |
 
 ## DependencyParser.set_annotations {#set_annotations tag="method"}
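`predict` deliberately leaves the `Doc`s unmodified and returns the internal parse state, which `set_annotations` then applies. A sketch of the two-step pattern (variable names illustrative, and the exact `set_annotations` signature taken on trust from the sections above and below):

```python
# Score a batch without touching the Docs, then write the parses back.
scores = parser.predict(docs)
parser.set_annotations(docs, scores)
```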
@@ -134,9 +127,8 @@ model. Delegates to [`predict`](/api/dependencyparser#predict) and
 >
 > ```python
 > parser = DependencyParser(nlp.vocab, parser_model)
-> losses = {}
 > optimizer = nlp.begin_training()
-> parser.update(examples, losses=losses, sgd=optimizer)
+> losses = parser.update(examples, sgd=optimizer)
 > ```
 
 | Name | Type | Description |
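Because the return value is the same mapping an optional `losses` argument receives, a training loop can still thread one dict through repeated calls to accumulate per-component loss. A sketch with assumed `batches` and `optimizer`:

```python
losses = {}
for batch in batches:
    # update() returns the dict it just updated, keyed by the model's name.
    losses = parser.update(batch, sgd=optimizer, losses=losses)
print(losses.get("parser"))
```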
diff --git a/website/docs/api/entitylinker.md b/website/docs/api/entitylinker.md
@@ -12,36 +12,28 @@ This class is a subclass of `Pipe` and follows the same API. The pipeline
 component is available in the [processing pipeline](/usage/processing-pipelines)
 via the ID `"entity_linker"`.
 
-## EntityLinker.Model {#model tag="classmethod"}
-
-Initialize a model for the pipe. The model should implement the
-`thinc.neural.Model` API, and should contain a field `tok2vec` that contains the
-context encoder. Wrappers are under development for most major machine learning
-libraries.
-
-| Name        | Type   | Description                           |
-| ----------- | ------ | ------------------------------------- |
-| `**kwargs`  | -      | Parameters for initializing the model |
-| **RETURNS** | object | The initialized model.                |
-
 ## EntityLinker.\_\_init\_\_ {#init tag="method"}
 
-Create a new pipeline instance. In your application, you would normally use a
-shortcut for this and instantiate the component using its string name and
-[`nlp.create_pipe`](/api/language#create_pipe).
-
 > #### Example
 >
 > ```python
-> # Construction via create_pipe
+> # Construction via create_pipe with default model
 > entity_linker = nlp.create_pipe("entity_linker")
 >
-> # Construction from class
+> # Construction via create_pipe with custom model
+> config = {"model": {"@architectures": "my_el"}}
+> entity_linker = nlp.create_pipe("entity_linker", config)
+>
+> # Construction from class with custom model from file
 > from spacy.pipeline import EntityLinker
-> entity_linker = EntityLinker(nlp.vocab, nel_model)
-> entity_linker.from_disk("/path/to/model")
+> model = util.load_config("model.cfg", create_objects=True)["model"]
+> entity_linker = EntityLinker(nlp.vocab, model)
 > ```
 
+Create a new pipeline instance. In your application, you would normally use a
+shortcut for this and instantiate the component using its string name and
+[`nlp.create_pipe`](/api/language#create_pipe).
+
 | Name    | Type    | Description            |
 | ------- | ------- | ---------------------- |
 | `vocab` | `Vocab` | The shared vocabulary. |
@@ -90,11 +82,11 @@ applied to the `Doc` in order. Both [`__call__`](/api/entitylinker#call) and
 > pass
 > ```
 
-| Name         | Type     | Description                                            |
-| ------------ | -------- | ------------------------------------------------------ |
-| `stream`     | iterable | A stream of documents.                                 |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |
+| Name         | Type            | Description                                            |
+| ------------ | --------------- | ------------------------------------------------------ |
+| `stream`     | `Iterable[Doc]` | A stream of documents.                                 |
+| `batch_size` | int             | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`           | Processed documents in the order of the original text. |
 
 ## EntityLinker.predict {#predict tag="method"}
@@ -142,9 +134,8 @@ pipe's entity linking model and context encoder. Delegates to
 >
 > ```python
 > entity_linker = EntityLinker(nlp.vocab, nel_model)
-> losses = {}
 > optimizer = nlp.begin_training()
-> entity_linker.update(examples, losses=losses, sgd=optimizer)
+> losses = entity_linker.update(examples, sgd=optimizer)
 > ```
 
 | Name | Type | Description |
@@ -155,7 +146,7 @@ pipe's entity linking model and context encoder. Delegates to
 | `set_annotations` | bool               | Whether or not to update the `Example` objects with the predictions, delegating to [`set_annotations`](/api/entitylinker#set_annotations). |
 | `sgd`             | `Optimizer`        | [`Optimizer`](https://thinc.ai/docs/api-optimizers) object.                                                                                |
 | `losses`          | `Dict[str, float]` | Optional record of the loss during training. The value keyed by the model's name is updated.                                               |
-| **RETURNS**       | float              | The loss from this batch.                                                                                                                  |
+| **RETURNS**       | `Dict[str, float]` | The updated `losses` dictionary.                                                                                                           |
 
 ## EntityLinker.get_loss {#get_loss tag="method"}
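As the updated **RETURNS** row spells out, the loss no longer comes back as a bare float; each component writes one value into the dict under its own name. A sketch of collecting losses from two components in a single pass (the component variables, `examples` and `optimizer` are assumed to exist):

```python
losses = {}
losses = ner.update(examples, sgd=optimizer, losses=losses)
losses = entity_linker.update(examples, sgd=optimizer, losses=losses)
# e.g. {"ner": 3.2, "entity_linker": 0.4} -- one entry per component name
```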
diff --git a/website/docs/api/entityrecognizer.md b/website/docs/api/entityrecognizer.md
@@ -8,35 +8,28 @@ This class is a subclass of `Pipe` and follows the same API. The pipeline
 component is available in the [processing pipeline](/usage/processing-pipelines)
 via the ID `"ner"`.
 
-## EntityRecognizer.Model {#model tag="classmethod"}
-
-Initialize a model for the pipe. The model should implement the
-`thinc.neural.Model` API. Wrappers are under development for most major machine
-learning libraries.
-
-| Name        | Type   | Description                           |
-| ----------- | ------ | ------------------------------------- |
-| `**kwargs`  | -      | Parameters for initializing the model |
-| **RETURNS** | object | The initialized model.                |
-
 ## EntityRecognizer.\_\_init\_\_ {#init tag="method"}
 
-Create a new pipeline instance. In your application, you would normally use a
-shortcut for this and instantiate the component using its string name and
-[`nlp.create_pipe`](/api/language#create_pipe).
-
 > #### Example
 >
 > ```python
 > # Construction via create_pipe
 > ner = nlp.create_pipe("ner")
 >
-> # Construction from class
+> # Construction via create_pipe with custom model
+> config = {"model": {"@architectures": "my_ner"}}
+> ner = nlp.create_pipe("ner", config)
+>
+> # Construction from class with custom model from file
 > from spacy.pipeline import EntityRecognizer
-> ner = EntityRecognizer(nlp.vocab, ner_model)
-> ner.from_disk("/path/to/model")
+> model = util.load_config("model.cfg", create_objects=True)["model"]
+> ner = EntityRecognizer(nlp.vocab, model)
 > ```
 
+Create a new pipeline instance. In your application, you would normally use a
+shortcut for this and instantiate the component using its string name and
+[`nlp.create_pipe`](/api/language#create_pipe).
+
 | Name    | Type    | Description            |
 | ------- | ------- | ---------------------- |
 | `vocab` | `Vocab` | The shared vocabulary. |
@@ -85,11 +78,11 @@ applied to the `Doc` in order. Both [`__call__`](/api/entityrecognizer#call) and
 > pass
 > ```
 
-| Name         | Type     | Description                                            |
-| ------------ | -------- | ------------------------------------------------------ |
-| `stream`     | iterable | A stream of documents.                                 |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |
+| Name         | Type            | Description                                            |
+| ------------ | --------------- | ------------------------------------------------------ |
+| `stream`     | `Iterable[Doc]` | A stream of documents.                                 |
+| `batch_size` | int             | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`           | Processed documents in the order of the original text. |
 
 ## EntityRecognizer.predict {#predict tag="method"}
@@ -135,9 +128,8 @@ model. Delegates to [`predict`](/api/entityrecognizer#predict) and
 >
 > ```python
 > ner = EntityRecognizer(nlp.vocab, ner_model)
-> losses = {}
 > optimizer = nlp.begin_training()
-> ner.update(examples, losses=losses, sgd=optimizer)
+> losses = ner.update(examples, sgd=optimizer)
 > ```
 
 | Name | Type | Description |
diff --git a/website/docs/api/language.md b/website/docs/api/language.md
@@ -68,15 +68,15 @@ more efficient than processing texts one-by-one.
 > assert doc.is_parsed
 > ```
 
-| Name                                         | Type     | Description                                                                                                                                                 |
-| -------------------------------------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `texts`                                      | iterable | A sequence of strings.                                                                                                                                      |
-| `as_tuples`                                  | bool     | If set to `True`, inputs should be a sequence of `(text, context)` tuples. Output will then be a sequence of `(doc, context)` tuples. Defaults to `False`. |
-| `batch_size`                                 | int      | The number of texts to buffer.                                                                                                                              |
-| `disable`                                    | list     | Names of pipeline components to [disable](/usage/processing-pipelines#disabling).                                                                          |
-| `component_cfg` <Tag variant="new">2.1</Tag> | dict     | Config parameters for specific pipeline components, keyed by component name.                                                                               |
-| `n_process` <Tag variant="new">2.2.2</Tag>   | int      | Number of processors to use, only supported in Python 3. Defaults to `1`.                                                                                  |
-| **YIELDS**                                   | `Doc`    | Documents in the order of the original text.                                                                                                                |
+| Name                                         | Type              | Description                                                                                                                                                 |
+| -------------------------------------------- | ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `texts`                                      | `Iterable[str]`   | A sequence of strings.                                                                                                                                      |
+| `as_tuples`                                  | bool              | If set to `True`, inputs should be a sequence of `(text, context)` tuples. Output will then be a sequence of `(doc, context)` tuples. Defaults to `False`. |
+| `batch_size`                                 | int               | The number of texts to buffer.                                                                                                                              |
+| `disable`                                    | `List[str]`       | Names of pipeline components to [disable](/usage/processing-pipelines#disabling).                                                                          |
+| `component_cfg` <Tag variant="new">2.1</Tag> | `Dict[str, Dict]` | Config parameters for specific pipeline components, keyed by component name.                                                                               |
+| `n_process` <Tag variant="new">2.2.2</Tag>   | int               | Number of processors to use, only supported in Python 3. Defaults to `1`.                                                                                  |
+| **YIELDS**                                   | `Doc`             | Documents in the order of the original text.                                                                                                                |
 
 ## Language.update {#update tag="method"}
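For the `as_tuples` mode described in the table above, a short sketch assuming an existing `nlp` pipeline:

```python
# Pair each text with an arbitrary context object; nlp.pipe yields the
# processed Doc together with its untouched context.
data = [("A text about Berlin.", {"id": 1}), ("Another text.", {"id": 2})]
for doc, context in nlp.pipe(data, as_tuples=True, batch_size=50):
    print(doc.ents, context["id"])
```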
@@ -99,6 +99,7 @@ Update the models in the pipeline.
 | `sgd`                                        | `Optimizer`        | An [`Optimizer`](https://thinc.ai/docs/api-optimizers) object.                |
 | `losses`                                     | `Dict[str, float]` | Dictionary to update with the loss, keyed by pipeline component.              |
 | `component_cfg` <Tag variant="new">2.1</Tag> | `Dict[str, Dict]`  | Config parameters for specific pipeline components, keyed by component name.  |
+| **RETURNS**                                  | `Dict[str, float]` | The updated `losses` dictionary.                                              |
 
 ## Language.evaluate {#evaluate tag="method"}
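With the added **RETURNS** row, `Language.update` mirrors the component-level change: the per-component losses come straight back to the caller. A sketch of a training loop reading them, assuming `nlp` and a list of training `examples` in the format this develop branch accepts:

```python
import random

optimizer = nlp.begin_training()
for epoch in range(10):
    random.shuffle(examples)
    losses = nlp.update(examples, sgd=optimizer)
    print(epoch, losses)  # e.g. {"tagger": ..., "parser": ...}
```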
diff --git a/website/docs/api/tagger.md b/website/docs/api/tagger.md
@@ -8,35 +8,28 @@ This class is a subclass of `Pipe` and follows the same API. The pipeline
 component is available in the [processing pipeline](/usage/processing-pipelines)
 via the ID `"tagger"`.
 
-## Tagger.Model {#model tag="classmethod"}
-
-Initialize a model for the pipe. The model should implement the
-`thinc.neural.Model` API. Wrappers are under development for most major machine
-learning libraries.
-
-| Name        | Type   | Description                           |
-| ----------- | ------ | ------------------------------------- |
-| `**kwargs`  | -      | Parameters for initializing the model |
-| **RETURNS** | object | The initialized model.                |
-
 ## Tagger.\_\_init\_\_ {#init tag="method"}
 
-Create a new pipeline instance. In your application, you would normally use a
-shortcut for this and instantiate the component using its string name and
-[`nlp.create_pipe`](/api/language#create_pipe).
-
 > #### Example
 >
 > ```python
 > # Construction via create_pipe
 > tagger = nlp.create_pipe("tagger")
 >
-> # Construction from class
+> # Construction via create_pipe with custom model
+> config = {"model": {"@architectures": "my_tagger"}}
+> tagger = nlp.create_pipe("tagger", config)
+>
+> # Construction from class with custom model from file
 > from spacy.pipeline import Tagger
-> tagger = Tagger(nlp.vocab, tagger_model)
-> tagger.from_disk("/path/to/model")
+> model = util.load_config("model.cfg", create_objects=True)["model"]
+> tagger = Tagger(nlp.vocab, model)
 > ```
 
+Create a new pipeline instance. In your application, you would normally use a
+shortcut for this and instantiate the component using its string name and
+[`nlp.create_pipe`](/api/language#create_pipe).
+
 | Name    | Type    | Description            |
 | ------- | ------- | ---------------------- |
 | `vocab` | `Vocab` | The shared vocabulary. |
@@ -83,11 +76,11 @@ applied to the `Doc` in order. Both [`__call__`](/api/tagger#call) and
 > pass
 > ```
 
-| Name         | Type     | Description                                            |
-| ------------ | -------- | ------------------------------------------------------ |
-| `stream`     | iterable | A stream of documents.                                 |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |
+| Name         | Type            | Description                                            |
+| ------------ | --------------- | ------------------------------------------------------ |
+| `stream`     | `Iterable[Doc]` | A stream of documents.                                 |
+| `batch_size` | int             | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`           | Processed documents in the order of the original text. |
 
 ## Tagger.predict {#predict tag="method"}
@@ -133,9 +126,8 @@ pipe's model. Delegates to [`predict`](/api/tagger#predict) and
 >
 > ```python
 > tagger = Tagger(nlp.vocab, tagger_model)
-> losses = {}
 > optimizer = nlp.begin_training()
-> tagger.update(examples, losses=losses, sgd=optimizer)
+> losses = tagger.update(examples, sgd=optimizer)
 > ```
 
 | Name | Type | Description |
@@ -146,6 +138,7 @@ pipe's model. Delegates to [`predict`](/api/tagger#predict) and
 | `set_annotations` | bool               | Whether or not to update the `Example` objects with the predictions, delegating to [`set_annotations`](/api/tagger#set_annotations). |
 | `sgd`             | `Optimizer`        | The [`Optimizer`](https://thinc.ai/docs/api-optimizers) object.                                                                      |
 | `losses`          | `Dict[str, float]` | Optional record of the loss during training. The value keyed by the model's name is updated.                                         |
+| **RETURNS**       | `Dict[str, float]` | The updated `losses` dictionary.                                                                                                     |
 
 ## Tagger.get_loss {#get_loss tag="method"}
diff --git a/website/docs/api/textcategorizer.md b/website/docs/api/textcategorizer.md
@@ -9,36 +9,28 @@ This class is a subclass of `Pipe` and follows the same API. The pipeline
 component is available in the [processing pipeline](/usage/processing-pipelines)
 via the ID `"textcat"`.
 
-## TextCategorizer.Model {#model tag="classmethod"}
-
-Initialize a model for the pipe. The model should implement the
-`thinc.neural.Model` API. Wrappers are under development for most major machine
-learning libraries.
-
-| Name        | Type   | Description                           |
-| ----------- | ------ | ------------------------------------- |
-| `**kwargs`  | -      | Parameters for initializing the model |
-| **RETURNS** | object | The initialized model.                |
-
 ## TextCategorizer.\_\_init\_\_ {#init tag="method"}
 
-Create a new pipeline instance. In your application, you would normally use a
-shortcut for this and instantiate the component using its string name and
-[`nlp.create_pipe`](/api/language#create_pipe).
-
 > #### Example
 >
 > ```python
 > # Construction via create_pipe
-> textcat = nlp.create_pipe("textcat")
+> textcat = nlp.create_pipe("textcat", config={"exclusive_classes": True})
 >
-> # Construction from class
+> # Construction via create_pipe with custom model
+> config = {"model": {"@architectures": "my_textcat"}}
+> textcat = nlp.create_pipe("textcat", config)
+>
+> # Construction from class with custom model from file
 > from spacy.pipeline import TextCategorizer
-> textcat = TextCategorizer(nlp.vocab, textcat_model)
-> textcat.from_disk("/path/to/model")
+> model = util.load_config("model.cfg", create_objects=True)["model"]
+> textcat = TextCategorizer(nlp.vocab, model)
 > ```
 
+Create a new pipeline instance. In your application, you would normally use a
+shortcut for this and instantiate the component using its string name and
+[`nlp.create_pipe`](/api/language#create_pipe).
+
 | Name    | Type    | Description            |
 | ------- | ------- | ---------------------- |
 | `vocab` | `Vocab` | The shared vocabulary. |
@@ -46,6 +38,7 @@ shortcut for this and instantiate the component using its string name and
 | `**cfg`     | -                 | Configuration parameters.     |
 | **RETURNS** | `TextCategorizer` | The newly constructed object. |
 
+<!-- TODO move to config page
 ### Architectures {#architectures new="2.1"}
 
 Text classification models can be used to solve a wide variety of problems.
@@ -60,6 +53,7 @@ argument.
 | `"ensemble"`   | **Default:** Stacked ensemble of a bag-of-words model and a neural network model. The neural network uses a CNN with mean pooling and attention. The "ngram_size" and "attr" arguments can be used to configure the feature extraction for the bag-of-words model.                                                                                                                                               |
 | `"simple_cnn"` | A neural network model where token vectors are calculated using a CNN. The vectors are mean pooled and used as features in a feed-forward network. This architecture is usually less accurate than the ensemble, but runs faster.                                                                                                                                                                                |
 | `"bow"`        | An ngram "bag-of-words" model. This architecture should run much faster than the others, but may not be as accurate, especially if texts are short. The features extracted can be controlled using the keyword arguments `ngram_size` and `attr`. For instance, `ngram_size=3` and `attr="lower"` would give lower-cased unigram, trigram and bigram features. 2, 3 or 4 are usually good choices of ngram size. |
+-->
 
 ## TextCategorizer.\_\_call\_\_ {#call tag="method"}
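The table being commented out above lists the v2-style architecture strings. For reference, this is roughly how one was selected through the component config in that API (a historical sketch; the exact config keys are assumed from the v2 docs):

```python
textcat = nlp.create_pipe(
    "textcat",
    config={"exclusive_classes": True, "architecture": "simple_cnn"},
)
```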
@@ -101,11 +95,11 @@ applied to the `Doc` in order. Both [`__call__`](/api/textcategorizer#call) and
 > pass
 > ```
 
-| Name         | Type     | Description                                            |
-| ------------ | -------- | ------------------------------------------------------ |
-| `stream`     | iterable | A stream of documents.                                 |
-| `batch_size` | int      | The number of texts to buffer. Defaults to `128`.      |
-| **YIELDS**   | `Doc`    | Processed documents in the order of the original text. |
+| Name         | Type            | Description                                            |
+| ------------ | --------------- | ------------------------------------------------------ |
+| `stream`     | `Iterable[Doc]` | A stream of documents.                                 |
+| `batch_size` | int             | The number of texts to buffer. Defaults to `128`.      |
+| **YIELDS**   | `Doc`           | Processed documents in the order of the original text. |
 
 ## TextCategorizer.predict {#predict tag="method"}
@@ -151,9 +145,8 @@ pipe's model. Delegates to [`predict`](/api/textcategorizer#predict) and
 >
 > ```python
 > textcat = TextCategorizer(nlp.vocab, textcat_model)
-> losses = {}
 > optimizer = nlp.begin_training()
-> textcat.update(examples, losses=losses, sgd=optimizer)
+> losses = textcat.update(examples, sgd=optimizer)
 > ```
 
 | Name | Type | Description |
@@ -164,6 +157,7 @@ pipe's model. Delegates to [`predict`](/api/textcategorizer#predict) and
 | `set_annotations` | bool               | Whether or not to update the `Example` objects with the predictions, delegating to [`set_annotations`](/api/textcategorizer#set_annotations). |
 | `sgd`             | `Optimizer`        | The [`Optimizer`](https://thinc.ai/docs/api-optimizers) object.                                                                                |
 | `losses`          | `Dict[str, float]` | Optional record of the loss during training. The value keyed by the model's name is updated.                                                   |
+| **RETURNS**       | `Dict[str, float]` | The updated `losses` dictionary.                                                                                                               |
 
 ## TextCategorizer.get_loss {#get_loss tag="method"}