diff --git a/website/docs/api/dependencyparser.md b/website/docs/api/dependencyparser.md
index 0e493e600..d52cad2c8 100644
--- a/website/docs/api/dependencyparser.md
+++ b/website/docs/api/dependencyparser.md
@@ -15,7 +15,7 @@ via the ID `"parser"`.
> ```python
> # Construction via create_pipe with default model
> parser = nlp.create_pipe("parser")
-> 
+>
> # Construction via create_pipe with custom model
> config = {"model": {"@architectures": "my_parser"}}
> parser = nlp.create_pipe("parser", config)

@@ -112,10 +112,10 @@ Modify a batch of documents, using pre-computed scores.
> parser.set_annotations([doc1, doc2], scores)
> ```

-| Name | Type | Description |
-| -------- | -------- | ---------------------------------------------------------- |
-| `docs` | iterable | The documents to modify. |
-| `scores` | - | The scores to set, produced by `DependencyParser.predict`. |
+| Name | Type | Description |
+| -------- | ------------------- | ---------------------------------------------------------- |
+| `docs` | `Iterable[Doc]` | The documents to modify. |
+| `scores` | `syntax.StateClass` | The scores to set, produced by `DependencyParser.predict`. |

## DependencyParser.update {#update tag="method"}

@@ -150,16 +150,15 @@ predicted scores.
>
> ```python
> parser = DependencyParser(nlp.vocab)
-> scores = parser.predict([doc1, doc2])
-> loss, d_loss = parser.get_loss([doc1, doc2], [gold1, gold2], scores)
+> scores = parser.predict([eg.predicted for eg in examples])
+> loss, d_loss = parser.get_loss(examples, scores)
> ```

-| Name | Type | Description |
-| ----------- | -------- | ------------------------------------------------------------ |
-| `docs` | iterable | The batch of documents. |
-| `golds` | iterable | The gold-standard data. Must have the same length as `docs`. |
-| `scores` | - | Scores representing the model's predictions. |
-| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |
+| Name | Type | Description |
+| ----------- | ------------------- | --------------------------------------------------- |
+| `examples` | `Iterable[Example]` | The batch of examples. |
+| `scores` | `syntax.StateClass` | Scores representing the model's predictions. |
+| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |

## DependencyParser.begin_training {#begin_training tag="method"}

@@ -193,9 +192,9 @@ component.
> optimizer = parser.create_optimizer()
> ```

-| Name | Type | Description |
-| ----------- | ----------- | -------------- |
-| **RETURNS** | `Optimizer` | The optimizer. |
+| Name | Type | Description |
+| ----------- | ----------- | --------------------------------------------------------------- |
+| **RETURNS** | `Optimizer` | The [`Optimizer`](https://thinc.ai/docs/api-optimizers) object. |

## DependencyParser.use_params {#use_params tag="method, contextmanager"}
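Taken together, the changes above replace the old `(docs, golds)` arguments with `Example` objects. A minimal sketch of how the updated calls chain together, assembled only from the signatures in this diff; it assumes `nlp` is a loaded pipeline and `examples` is a batch of `Example` objects, each pairing a predicted `Doc` with its gold-standard reference:

```python
# Hedged sketch of the Example-based API documented above. Assumes `nlp`
# and `examples` already exist.
parser = nlp.create_pipe("parser")

# predict() runs the model over the predicted side of each Example ...
scores = parser.predict([eg.predicted for eg in examples])

# ... get_loss() pairs the whole Examples with those scores ...
loss, d_loss = parser.get_loss(examples, scores)

# ... and set_annotations() writes the pre-computed scores back onto the docs.
parser.set_annotations([eg.predicted for eg in examples], scores)
```

The same pattern recurs for the entity recognizer and tagger below.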
diff --git a/website/docs/api/entitylinker.md b/website/docs/api/entitylinker.md
index 754c2fc33..ca0a0b34c 100644
--- a/website/docs/api/entitylinker.md
+++ b/website/docs/api/entitylinker.md
@@ -96,13 +96,13 @@ Apply the pipeline's model to a batch of docs, without modifying them.
>
> ```python
> entity_linker = EntityLinker(nlp.vocab)
-> kb_ids, tensors = entity_linker.predict([doc1, doc2])
+> kb_ids = entity_linker.predict([doc1, doc2])
> ```

-| Name | Type | Description |
-| ----------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `docs` | iterable | The documents to predict. |
-| **RETURNS** | tuple | A `(kb_ids, tensors)` tuple where `kb_ids` are the model's predicted KB identifiers for the entities in the `docs`, and `tensors` are the token representations used to predict these identifiers. |
+| Name | Type | Description |
+| ----------- | --------------- | ------------------------------------------------------------ |
+| `docs` | `Iterable[Doc]` | The documents to predict. |
+| **RETURNS** | `Iterable[str]` | The predicted KB identifiers for the entities in the `docs`. |

## EntityLinker.set_annotations {#set_annotations tag="method"}

@@ -113,15 +113,14 @@ entities.
>
> ```python
> entity_linker = EntityLinker(nlp.vocab)
-> kb_ids, tensors = entity_linker.predict([doc1, doc2])
-> entity_linker.set_annotations([doc1, doc2], kb_ids, tensors)
+> kb_ids = entity_linker.predict([doc1, doc2])
+> entity_linker.set_annotations([doc1, doc2], kb_ids)
> ```

-| Name | Type | Description |
-| --------- | -------- | ------------------------------------------------------------------------------------------------- |
-| `docs` | iterable | The documents to modify. |
-| `kb_ids` | iterable | The knowledge base identifiers for the entities in the docs, predicted by `EntityLinker.predict`. |
-| `tensors` | iterable | The token representations used to predict the identifiers. |
+| Name | Type | Description |
+| -------- | --------------- | ------------------------------------------------------------------------------------------------- |
+| `docs` | `Iterable[Doc]` | The documents to modify. |
+| `kb_ids` | `Iterable[str]` | The knowledge base identifiers for the entities in the docs, predicted by `EntityLinker.predict`. |

## EntityLinker.update {#update tag="method"}

@@ -148,27 +147,6 @@ pipe's entity linking model and context encoder. Delegates to
| `losses` | `Dict[str, float]` | Optional record of the loss during training. The value keyed by the model's name is updated. |
| **RETURNS** | `Dict[str, float]` | The updated `losses` dictionary. |

-## EntityLinker.get_loss {#get_loss tag="method"}
-
-Find the loss and gradient of loss for the entities in a batch of documents and
-their predicted scores.
-
-> #### Example
->
-> ```python
-> entity_linker = EntityLinker(nlp.vocab)
-> kb_ids, tensors = entity_linker.predict(docs)
-> loss, d_loss = entity_linker.get_loss(docs, [gold1, gold2], kb_ids, tensors)
-> ```
-
-| Name | Type | Description |
-| ----------- | -------- | ------------------------------------------------------------ |
-| `docs` | iterable | The batch of documents. |
-| `golds` | iterable | The gold-standard data. Must have the same length as `docs`. |
-| `kb_ids` | iterable | KB identifiers representing the model's predictions. |
-| `tensors` | iterable | The token representations used to predict the identifiers |
-| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |
-
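The entity linker changes drop the token tensors from the public API entirely: `predict` now returns only the KB identifiers, `set_annotations` loses its `tensors` argument, and the separate `get_loss` section is removed. A sketch of the new flow, assuming `doc1` and `doc2` are processed `Doc` objects with entities to disambiguate:

```python
# Before this change: kb_ids, tensors = entity_linker.predict([doc1, doc2])
# After it, predict() returns only the KB identifiers:
entity_linker = EntityLinker(nlp.vocab)
kb_ids = entity_linker.predict([doc1, doc2])         # Iterable[str]
entity_linker.set_annotations([doc1, doc2], kb_ids)  # no tensors argument
```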
## EntityLinker.set_kb {#set_kb tag="method"}

Define the knowledge base (KB) used for disambiguating named entities to KB

@@ -219,9 +197,9 @@ Create an optimizer for the pipeline component.
> optimizer = entity_linker.create_optimizer()
> ```

-| Name | Type | Description |
-| ----------- | -------- | -------------- |
-| **RETURNS** | callable | The optimizer. |
+| Name | Type | Description |
+| ----------- | ----------- | --------------------------------------------------------------- |
+| **RETURNS** | `Optimizer` | The [`Optimizer`](https://thinc.ai/docs/api-optimizers) object. |

## EntityLinker.use_params {#use_params tag="method, contextmanager"}

diff --git a/website/docs/api/entityrecognizer.md b/website/docs/api/entityrecognizer.md
index 5739afff4..75d6332f2 100644
--- a/website/docs/api/entityrecognizer.md
+++ b/website/docs/api/entityrecognizer.md
@@ -15,7 +15,7 @@ via the ID `"ner"`.
> ```python
> # Construction via create_pipe
> ner = nlp.create_pipe("ner")
-> 
+>
> # Construction via create_pipe with custom model
> config = {"model": {"@architectures": "my_ner"}}
-> parser = nlp.create_pipe("ner", config)
+> ner = nlp.create_pipe("ner", config)

@@ -92,13 +92,13 @@ Apply the pipeline's model to a batch of docs, without modifying them.
>
> ```python
> ner = EntityRecognizer(nlp.vocab)
-> scores, tensors = ner.predict([doc1, doc2])
+> scores = ner.predict([doc1, doc2])
> ```

-| Name | Type | Description |
-| ----------- | -------- | ---------------------------------------------------------------------------------------------------------- |
-| `docs` | iterable | The documents to predict. |
-| **RETURNS** | list | List of `syntax.StateClass` objects. `syntax.StateClass` is a helper class for the parse state (internal). |
+| Name | Type | Description |
+| ----------- | ------------------ | ---------------------------------------------------------------------------------------------------------- |
+| `docs` | `Iterable[Doc]` | The documents to predict. |
+| **RETURNS** | `List[StateClass]` | List of `syntax.StateClass` objects. `syntax.StateClass` is a helper class for the parse state (internal). |

## EntityRecognizer.set_annotations {#set_annotations tag="method"}

@@ -108,15 +108,14 @@ Modify a batch of documents, using pre-computed scores.
>
> ```python
> ner = EntityRecognizer(nlp.vocab)
-> scores, tensors = ner.predict([doc1, doc2])
-> ner.set_annotations([doc1, doc2], scores, tensors)
+> scores = ner.predict([doc1, doc2])
+> ner.set_annotations([doc1, doc2], scores)
> ```

-| Name | Type | Description |
-| --------- | -------- | ---------------------------------------------------------- |
-| `docs` | iterable | The documents to modify. |
-| `scores` | - | The scores to set, produced by `EntityRecognizer.predict`. |
-| `tensors` | iterable | The token representations used to predict the scores. |
+| Name | Type | Description |
+| -------- | ------------------ | ---------------------------------------------------------- |
+| `docs` | `Iterable[Doc]` | The documents to modify. |
+| `scores` | `List[StateClass]` | The scores to set, produced by `EntityRecognizer.predict`. |

## EntityRecognizer.update {#update tag="method"}

@@ -151,16 +150,15 @@ predicted scores.
>
> ```python
> ner = EntityRecognizer(nlp.vocab)
-> scores = ner.predict([doc1, doc2])
-> loss, d_loss = ner.get_loss([doc1, doc2], [gold1, gold2], scores)
+> scores = ner.predict([eg.predicted for eg in examples])
+> loss, d_loss = ner.get_loss(examples, scores)
> ```

-| Name | Type | Description |
-| ----------- | -------- | ------------------------------------------------------------ |
-| `docs` | iterable | The batch of documents. |
-| `golds` | iterable | The gold-standard data. Must have the same length as `docs`. |
-| `scores` | - | Scores representing the model's predictions. |
-| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |
+| Name | Type | Description |
+| ----------- | ------------------- | --------------------------------------------------- |
+| `examples` | `Iterable[Example]` | The batch of examples. |
+| `scores` | `List[StateClass]` | Scores representing the model's predictions. |
+| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |

## EntityRecognizer.begin_training {#begin_training tag="method"}

@@ -182,8 +180,6 @@ Initialize the pipe for training, using data examples if available. Return an
| `sgd` | `Optimizer` | An optional [`Optimizer`](https://thinc.ai/docs/api-optimizers) object. Will be created via [`create_optimizer`](/api/entityrecognizer#create_optimizer) if not set. |
| **RETURNS** | `Optimizer` | An optimizer. |

-|
-
## EntityRecognizer.create_optimizer {#create_optimizer tag="method"}

Create an optimizer for the pipeline component.

@@ -195,9 +191,9 @@
> optimizer = ner.create_optimizer()
> ```

-| Name | Type | Description |
-| ----------- | -------- | -------------- |
-| **RETURNS** | callable | The optimizer. |
+| Name | Type | Description |
+| ----------- | ----------- | --------------------------------------------------------------- |
+| **RETURNS** | `Optimizer` | The [`Optimizer`](https://thinc.ai/docs/api-optimizers) object. |

## EntityRecognizer.use_params {#use_params tag="method, contextmanager"}

diff --git a/website/docs/api/language.md b/website/docs/api/language.md
index c9cfd2f2d..3ba93b360 100644
--- a/website/docs/api/language.md
+++ b/website/docs/api/language.md
@@ -52,7 +52,7 @@ contain arbitrary whitespace. Alignment into the original string is preserved.
| Name | Type | Description |
| ----------- | ----- | ----------------------------------------------------------------------------------- |
| `text` | str | The text to be processed. |
-| `disable` | list | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
+| `disable` | `List[str]` | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
| **RETURNS** | `Doc` | A container for accessing the annotations. |

## Language.pipe {#pipe tag="method"}

diff --git a/website/docs/api/tagger.md b/website/docs/api/tagger.md
index 5f625f842..9ef0843cf 100644
--- a/website/docs/api/tagger.md
+++ b/website/docs/api/tagger.md
@@ -15,7 +15,7 @@ via the ID `"tagger"`.
> ```python
> # Construction via create_pipe
> tagger = nlp.create_pipe("tagger")
-> 
+>
> # Construction via create_pipe with custom model
> config = {"model": {"@architectures": "my_tagger"}}
-> parser = nlp.create_pipe("tagger", config)
+> tagger = nlp.create_pipe("tagger", config)

@@ -90,13 +90,13 @@ Apply the pipeline's model to a batch of docs, without modifying them.
>
> ```python
> tagger = Tagger(nlp.vocab)
-> scores, tensors = tagger.predict([doc1, doc2])
+> scores = tagger.predict([doc1, doc2])
> ```

-| Name | Type | Description |
-| ----------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `docs` | iterable | The documents to predict. |
-| **RETURNS** | tuple | A `(scores, tensors)` tuple where `scores` is the model's prediction for each document and `tensors` is the token representations used to predict the scores. Each tensor is an array with one row for each token in the document. |
+| Name | Type | Description |
+| ----------- | --------------- | ----------------------------------------- |
+| `docs` | `Iterable[Doc]` | The documents to predict. |
+| **RETURNS** | - | The model's prediction for each document. |

## Tagger.set_annotations {#set_annotations tag="method"}

@@ -106,15 +106,14 @@ Modify a batch of documents, using pre-computed scores.
>
> ```python
> tagger = Tagger(nlp.vocab)
-> scores, tensors = tagger.predict([doc1, doc2])
-> tagger.set_annotations([doc1, doc2], scores, tensors)
+> scores = tagger.predict([doc1, doc2])
+> tagger.set_annotations([doc1, doc2], scores)
> ```

-| Name | Type | Description |
-| --------- | -------- | ----------------------------------------------------- |
-| `docs` | iterable | The documents to modify. |
-| `scores` | - | The scores to set, produced by `Tagger.predict`. |
-| `tensors` | iterable | The token representations used to predict the scores. |
+| Name | Type | Description |
+| -------- | --------------- | ------------------------------------------------ |
+| `docs` | `Iterable[Doc]` | The documents to modify. |
+| `scores` | - | The scores to set, produced by `Tagger.predict`. |

## Tagger.update {#update tag="method"}

@@ -149,16 +148,15 @@ predicted scores.
>
> ```python
> tagger = Tagger(nlp.vocab)
-> scores = tagger.predict([doc1, doc2])
-> loss, d_loss = tagger.get_loss([doc1, doc2], [gold1, gold2], scores)
+> scores = tagger.predict([eg.predicted for eg in examples])
+> loss, d_loss = tagger.get_loss(examples, scores)
> ```

-| Name | Type | Description |
-| ----------- | -------- | ------------------------------------------------------------ |
-| `docs` | iterable | The batch of documents. |
-| `golds` | iterable | The gold-standard data. Must have the same length as `docs`. |
-| `scores` | - | Scores representing the model's predictions. |
-| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |
+| Name | Type | Description |
+| ----------- | ------------------- | --------------------------------------------------- |
+| `examples` | `Iterable[Example]` | The batch of examples. |
+| `scores` | - | Scores representing the model's predictions. |
+| **RETURNS** | tuple | The loss and the gradient, i.e. `(loss, gradient)`. |

## Tagger.begin_training {#begin_training tag="method"}

@@ -191,9 +189,9 @@ Create an optimizer for the pipeline component.
> optimizer = tagger.create_optimizer()
> ```

-| Name | Type | Description |
-| ----------- | -------- | -------------- |
-| **RETURNS** | callable | The optimizer. |
+| Name | Type | Description |
+| ----------- | ----------- | --------------------------------------------------------------- |
+| **RETURNS** | `Optimizer` | The [`Optimizer`](https://thinc.ai/docs/api-optimizers) object. |

## Tagger.use_params {#use_params tag="method, contextmanager"}
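Across the parser, entity recognizer and tagger, `begin_training` and `create_optimizer` now consistently return a thinc [`Optimizer`](https://thinc.ai/docs/api-optimizers), and `update` records its loss in a `losses` dict keyed by the component name. A hedged sketch of one training step built from those pieces; it assumes `examples` is a batch of `Example` objects and that `update` accepts it directly, mirroring the `get_loss(examples, scores)` signature documented here:

```python
# Sketch only: the exact update() signature is not shown in this diff.
tagger = nlp.create_pipe("tagger")
optimizer = tagger.begin_training()  # or: optimizer = tagger.create_optimizer()
losses = {}
tagger.update(examples, sgd=optimizer, losses=losses)
print(losses)  # loss recorded under the component's name, e.g. {"tagger": ...}
```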
diff --git a/website/docs/api/textcategorizer.md b/website/docs/api/textcategorizer.md
index ff9890dd6..08e922ba7 100644
--- a/website/docs/api/textcategorizer.md
+++ b/website/docs/api/textcategorizer.md
@@ -16,11 +16,11 @@ via the ID `"textcat"`.
> ```python
> # Construction via create_pipe
> textcat = nlp.create_pipe("textcat")
-> 
+>
> # Construction via create_pipe with custom model
> config = {"model": {"@architectures": "my_textcat"}}
-> parser = nlp.create_pipe("textcat", config)
+> textcat = nlp.create_pipe("textcat", config)
-> 
+>
> # Construction from class with custom model from file
> from spacy.pipeline import TextCategorizer
> model = util.load_config("model.cfg", create_objects=True)["model"]

@@ -38,7 +38,7 @@ shortcut for this and instantiate the component using its string name and
| `**cfg` | - | Configuration parameters. |
| **RETURNS** | `TextCategorizer` | The newly constructed object. |

-
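For reference, the three construction paths from the example above, spelled out. The `"my_textcat"` architecture and `"model.cfg"` file are placeholders, and the final constructor call is an assumption: the hunk is cut off before the original example's last line.

```python
from spacy import util
from spacy.pipeline import TextCategorizer

# 1) Construction via create_pipe with the default model
textcat = nlp.create_pipe("textcat")

# 2) Construction via create_pipe with a registered custom architecture
config = {"model": {"@architectures": "my_textcat"}}
textcat = nlp.create_pipe("textcat", config)

# 3) Construction from the class, with the model built from a config file
model = util.load_config("model.cfg", create_objects=True)["model"]
textcat = TextCategorizer(nlp.vocab, model)  # assumed constructor call
```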