Update Scorer API docs for score_cats

Adriane Boyd 2020-07-27 15:34:42 +02:00
parent 34c92dfe63
commit fdf09cb231


@@ -8,8 +8,8 @@ source: spacy/scorer.py

The `Scorer` computes evaluation scores. It's typically created by
[`Language.evaluate`](/api/language#evaluate).

In addition, the `Scorer` provides a number of evaluation methods for evaluating
`Token` and `Doc` attributes.

## Scorer.\_\_init\_\_ {#init tag="method"}
@@ -29,7 +29,7 @@ Create a new `Scorer`.

> ```

| Name        | Type     | Description |
| ----------- | -------- | ----------- |
| `nlp`       | Language | The pipeline to use for scoring, where each pipeline component may provide a scoring method. If none is provided, then a default pipeline for the multi-language code `xx` is constructed containing: `senter`, `tagger`, `morphologizer`, `parser`, `ner`, `textcat`. |
| **RETURNS** | `Scorer` | The newly created object. |
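
As a minimal sketch (a blank English pipeline stands in here for a trained one):

```python
import spacy
from spacy.scorer import Scorer

# Build a Scorer from an existing pipeline. Called with no arguments, the
# Scorer constructs a default multi-language ("xx") pipeline instead.
nlp = spacy.blank("en")
scorer = Scorer(nlp)
```
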
@@ -39,12 +39,12 @@ Calculate the scores for a list of [`Example`](/api/example) objects using the
scoring methods provided by the components in the pipeline.

The returned `Dict` contains the scores provided by the individual pipeline
components. For the scoring methods provided by the `Scorer` and used by the
core pipeline components, the individual score names start with the `Token` or
`Doc` attribute being scored: `token_acc`, `token_p/r/f`, `sents_p/r/f`,
`tag_acc`, `pos_acc`, `morph_acc`, `morph_per_feat`, `lemma_acc`, `dep_uas`,
`dep_las`, `dep_las_per_type`, `ents_p/r/f`, `ents_per_type`,
`textcat_macro_auc`, `textcat_macro_f`.

> #### Example
>
@@ -54,18 +54,19 @@ core pipeline components, the individual score names start with the `Token` or

> ```

| Name        | Type                | Description |
| ----------- | ------------------- | ----------- |
| `examples`  | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| **RETURNS** | `Dict`              | A dictionary of scores. |
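
A minimal usage sketch, assuming the released v3 import path `spacy.training.Example` and `Example.from_dict` to pair a predicted doc with gold-standard annotations:

```python
import spacy
from spacy.scorer import Scorer
from spacy.training import Example

nlp = spacy.blank("en")
scorer = Scorer(nlp)

# Pair the predicted doc with gold-standard tokens for the same text.
pred = nlp("Berlin is nice")
example = Example.from_dict(pred, {"words": ["Berlin", "is", "nice"]})

scores = scorer.score([example])
print(scores["token_acc"])
```
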
## Scorer.score_tokenization {#score_tokenization tag="staticmethod"}

Scores the tokenization:

- `token_acc`: # correct tokens / # gold tokens
- `token_p/r/f`: PRF for token character spans

| Name        | Type                | Description |
| ----------- | ------------------- | ----------- |
| `examples`  | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| **RETURNS** | `Dict`              | A dictionary containing the scores `token_acc/p/r/f`. |
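
For example, a sketch reusing the `example` object from the `Scorer.score` snippet above:

```python
# Compare predicted token boundaries against the gold-standard tokens.
scores = Scorer.score_tokenization([example])
print(scores["token_acc"], scores["token_f"])
```
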
@@ -74,7 +75,7 @@ Scores the tokenization:

Scores a single token attribute.

| Name       | Type                | Description |
| ---------- | ------------------- | ----------- |
| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| `attr`     | `str`               | The attribute to score. |
| `getter`   | `callable`          | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
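
A hedged sketch, assuming `examples` carry gold-standard `pos` annotations:

```python
# Score the coarse-grained POS tag of each token. The returned key is
# prefixed with the attribute name, e.g. "pos_acc".
scores = Scorer.score_token_attr(examples, "pos")
print(scores["pos_acc"])
```
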
@@ -82,10 +83,11 @@ Scores a single token attribute.

## Scorer.score_token_attr_per_feat {#score_token_attr_per_feat tag="staticmethod"}

Scores a single token attribute per feature, for a token attribute in UFEATS
format.

| Name       | Type                | Description |
| ---------- | ------------------- | ----------- |
| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| `attr`     | `str`               | The attribute to score. |
| `getter`   | `callable`          | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
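
A sketch, assuming gold `morph` annotations in UFEATS format (e.g. `Case=Nom|Number=Sing`):

```python
# Split each UFEATS string into individual features and compute per-feature
# PRF scores, returned under "morph_per_feat".
scores = Scorer.score_token_attr_per_feat(examples, "morph")
print(scores["morph_per_feat"])
```
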
@@ -96,7 +98,7 @@ Scores a single token attribute per feature, for a token attribute in UFEATS

Returns PRF scores for labeled or unlabeled spans.

| Name       | Type                | Description |
| ---------- | ------------------- | ----------- |
| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| `attr`     | `str`               | The attribute to score. |
| `getter`   | `callable`          | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the `Span` objects for an individual `Doc`. |
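
For instance, a sketch scoring named entity spans stored on `Doc.ents`:

```python
# PRF over entity spans, plus a per-label breakdown.
scores = Scorer.score_spans(examples, "ents")
print(scores["ents_f"])
print(scores["ents_per_type"])
```
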
@@ -107,27 +109,27 @@ Returns PRF scores for labeled or unlabeled spans.

Calculate the UAS, LAS, and LAS per type scores for dependency parses.

| Name            | Type                | Description |
| --------------- | ------------------- | ----------- |
| `examples`      | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| `attr`          | `str`               | The attribute containing the dependency label. |
| `getter`        | `callable`          | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
| `head_attr`     | `str`               | The attribute containing the head token. |
| `head_getter`   | `callable`          | Defaults to `getattr`. If provided, `head_getter(token, attr)` should return the head for an individual `Token`. |
| `ignore_labels` | `Tuple`             | Labels to ignore while scoring (e.g., `punct`). |
| **RETURNS**     | `Dict`              | A dictionary containing the scores: `attr_uas`, `attr_las`, and `attr_las_per_type`. |
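
A sketch, assuming gold dependency annotations and the conventional exclusion of punctuation arcs:

```python
# UAS counts a token as correct if its head matches; LAS additionally
# requires the dependency label to match.
scores = Scorer.score_deps(
    examples,
    "dep",
    head_attr="head",
    ignore_labels=("punct",),
)
print(scores["dep_uas"], scores["dep_las"])
```
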
## Scorer.score_cats {#score_cats tag="staticmethod"}

Calculate PRF and ROC AUC scores for a doc-level attribute that is a dict
containing scores for each label like `Doc.cats`. The reported overall score
depends on the scorer settings.

| Name             | Type                | Description |
| ---------------- | ------------------- | ----------- |
| `examples`       | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
| `attr`           | `str`               | The attribute to score. |
| `getter`         | `callable`          | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the cats for an individual `Doc`. |
| `labels`         | `Iterable[str]`     | The set of possible labels. Defaults to `[]`. |
| `multi_label`    | `bool`              | Whether the attribute allows multiple labels. Defaults to `True`. |
| `positive_label` | `str`               | The positive label for a binary task with exclusive classes. Defaults to `None`. |
| **RETURNS**      | `Dict`              | A dictionary containing the scores, with inapplicable scores as `None`: 1) for all: `attr_score` (one of `attr_f` / `attr_macro_f` / `attr_macro_auc`), `attr_score_desc` (text description of the overall score), `attr_f_per_type`, `attr_auc_per_type`; 2) for binary exclusive with positive label: `attr_p/r/f`; 3) for 3+ exclusive classes, macro-averaged fscore: `attr_macro_f`; 4) for multilabel, macro-averaged AUC: `attr_macro_auc` |