From fdf09cb2313e18702b9e59d55a15a10394ca3612 Mon Sep 17 00:00:00 2001
From: Adriane Boyd
Date: Mon, 27 Jul 2020 15:34:42 +0200
Subject: [PATCH] Update Scorer API docs for score_cats

---
 website/docs/api/scorer.md | 112 +++++++++++++++++++------------------
 1 file changed, 57 insertions(+), 55 deletions(-)

diff --git a/website/docs/api/scorer.md b/website/docs/api/scorer.md
index ef4396e1b..8daefd241 100644
--- a/website/docs/api/scorer.md
+++ b/website/docs/api/scorer.md
@@ -8,8 +8,8 @@ source: spacy/scorer.py

 The `Scorer` computes evaluation scores. It's typically created by
 [`Language.evaluate`](/api/language#evaluate).

-In addition, the `Scorer` provides a number of evaluation methods for
-evaluating `Token` and `Doc` attributes.
+In addition, the `Scorer` provides a number of evaluation methods for evaluating
+`Token` and `Doc` attributes.

 ## Scorer.\_\_init\_\_ {#init tag="method"}

@@ -28,10 +28,10 @@ Create a new `Scorer`.
 > scorer = Scorer(nlp)
 > ```

-| Name | Type | Description |
-| ------------ | -------- | ------------------------------------------------------------ |
-| `nlp` | Language | The pipeline to use for scoring, where each pipeline component may provide a scoring method. If none is provided, then a default pipeline for the multi-language code `xx` is constructed containing: `senter`, `tagger`, `morphologizer`, `parser`, `ner`, `textcat`. |
-| **RETURNS** | `Scorer` | The newly created object. |
+| Name        | Type     | Description |
+| ----------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `nlp`       | Language | The pipeline to use for scoring, where each pipeline component may provide a scoring method. If none is provided, then a default pipeline for the multi-language code `xx` is constructed containing: `senter`, `tagger`, `morphologizer`, `parser`, `ner`, `textcat`. |
+| **RETURNS** | `Scorer` | The newly created object. |

 ## Scorer.score {#score tag="method"}

@@ -39,13 +39,13 @@ Calculate the scores for a list of [`Example`](/api/example) objects using the
 scoring methods provided by the components in the pipeline.

 The returned `Dict` contains the scores provided by the individual pipeline
-components. For the scoring methods provided by the `Scorer` and use by the
-core pipeline components, the individual score names start with the `Token` or
-`Doc` attribute being scored: `token_acc`, `token_p/r/f`, `sents_p/r/f`,
-`tag_acc`, `pos_acc`, `morph_acc`, `morph_per_feat`, `lemma_acc`, `dep_uas`,
-`dep_las`, `dep_las_per_type`, `ents_p/r/f`, `ents_per_type`,
-`textcat_macro_auc`, `textcat_macro_f`.
-
+components. For the scoring methods provided by the `Scorer` and used by the core
+pipeline components, the individual score names start with the `Token` or `Doc`
+attribute being scored: `token_acc`, `token_p/r/f`, `sents_p/r/f`, `tag_acc`,
+`pos_acc`, `morph_acc`, `morph_per_feat`, `lemma_acc`, `dep_uas`, `dep_las`,
+`dep_las_per_type`, `ents_p/r/f`, `ents_per_type`, `textcat_macro_auc`,
+`textcat_macro_f`.
+
 > #### Example
 >
 > ```python
@@ -53,19 +53,20 @@ core pipeline components, the individual score names start with the `Token` or
 > scorer.score(examples)
 > ```

-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
+| Name        | Type                | Description |
+| ----------- | ------------------- | ----------------------------------------------------------------------------------------------- |
 | `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
 | **RETURNS** | `Dict` | A dictionary of scores. |
+
 ## Scorer.score_tokenization {#score_tokenization tag="staticmethod"}

 Scores the tokenization:

-* `token_acc`: # correct tokens / # gold tokens
-* `token_p/r/f`: PRF for token character spans
+- `token_acc`: # correct tokens / # gold tokens
+- `token_p/r/f`: PRF for token character spans

-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
+| Name        | Type                | Description |
+| ----------- | ------------------- | ----------------------------------------------------------------------------------------------- |
 | `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
 | **RETURNS** | `Dict` | A dictionary containing the scores `token_acc/p/r/f`. |

@@ -73,61 +74,62 @@ Scores the tokenization:

 ## Scorer.score_token_attr {#score_token_attr tag="staticmethod"}

 Scores a single token attribute.

-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
-| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
-| `attr` | `str` | The attribute to score. |
+| Name        | Type                | Description |
+| ----------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| `examples`  | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
+| `attr`      | `str`               | The attribute to score. |
 | `getter` | `callable` | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
-| **RETURNS** | `Dict` | A dictionary containing the score `attr_acc`. |
+| **RETURNS** | `Dict`              | A dictionary containing the score `attr_acc`. |

 ## Scorer.score_token_attr_per_feat {#score_token_attr_per_feat tag="staticmethod"}

-Scores a single token attribute per feature for a token attribute in UFEATS format.
+Scores a single token attribute per feature for a token attribute in UFEATS
+format.

-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
-| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
-| `attr` | `str` | The attribute to score. |
+| Name        | Type                | Description |
+| ----------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| `examples`  | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
+| `attr`      | `str`               | The attribute to score. |
 | `getter` | `callable` | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
-| **RETURNS** | `Dict` | A dictionary containing the per-feature PRF scores unders the key `attr_per_feat`. |
+| **RETURNS** | `Dict`              | A dictionary containing the per-feature PRF scores under the key `attr_per_feat`. |

 ## Scorer.score_spans {#score_spans tag="staticmethod"}

 Returns PRF scores for labeled or unlabeled spans.

-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
-| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
-| `attr` | `str` | The attribute to score. |
-| `getter` | `callable` | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the `Span` objects for an individual `Doc`. |
+| Name        | Type                | Description |
+| ----------- | ------------------- | ----------------------------------------------------------------------- |
+| `examples`  | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
+| `attr`      | `str`               | The attribute to score. |
+| `getter`    | `callable`          | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the `Span` objects for an individual `Doc`. |
 | **RETURNS** | `Dict` | A dictionary containing the PRF scores under the keys `attr_p/r/f` and the per-type PRF scores under `attr_per_type`. |

 ## Scorer.score_deps {#score_deps tag="staticmethod"}

 Calculate the UAS, LAS, and LAS per type scores for dependency parses.

-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
-| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
-| `attr` | `str` | The attribute containing the dependency label. |
-| `getter` | `callable` | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
-| `head_attr` | `str` | The attribute containing the head token. |
-| `head_getter` | `callable` | Defaults to `getattr`. If provided, `head_getter(token, attr)` should return the head for an individual `Token`. |
-| `ignore_labels` | `Tuple` | Labels to ignore while scoring (e.g., `punct`).
-| **RETURNS** | `Dict` | A dictionary containing the scores: `attr_uas`, `attr_las`, and `attr_las_per_type`. |
+| Name            | Type                | Description |
+| --------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| `examples`      | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
+| `attr`          | `str`               | The attribute containing the dependency label. |
+| `getter`        | `callable`          | Defaults to `getattr`. If provided, `getter(token, attr)` should return the value of the attribute for an individual `Token`. |
+| `head_attr`     | `str`               | The attribute containing the head token. |
+| `head_getter`   | `callable`          | Defaults to `getattr`. If provided, `head_getter(token, attr)` should return the head for an individual `Token`. |
+| `ignore_labels` | `Tuple`             | Labels to ignore while scoring (e.g., `punct`). |
+| **RETURNS**     | `Dict`              | A dictionary containing the scores: `attr_uas`, `attr_las`, and `attr_las_per_type`. |

 ## Scorer.score_cats {#score_cats tag="staticmethod"}

 Calculate PRF and ROC AUC scores for a doc-level attribute that is a dict
-containing scores for each label like `Doc.cats`.
-
-| Name | Type | Description |
-| ----------- | --------- | --------------------------------------------------------------------------------------------------------|
-| `examples` | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
-| `attr` | `str` | The attribute to score. |
-| `getter` | `callable` | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the cats for an individual `Doc`. |
-| labels | `Iterable[str]` | The set of possible labels. Defaults to `[]`. |
-| multi_label | `bool` | Whether the attribute allows multiple labels. Defaults to `True`. |
-| positive_label | `str` | The positive label for a binary task with exclusive classes. Defaults to `None`. |
-| **RETURNS** | `Dict` | A dictionary containing the scores: 1) for binary exclusive with positive label: `attr_p/r/f`; 2) for 3+ exclusive classes, macro-averaged fscore: `attr_macro_f`; 3) for multilabel, macro-averaged AUC: `attr_macro_auc`; 4) for all: `attr_f_per_type`, `attr_auc_per_type` |
+containing scores for each label like `Doc.cats`. The reported overall score
+depends on the scorer settings.
+
+| Name             | Type                | Description |
+| ---------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| `examples`       | `Iterable[Example]` | The `Example` objects holding both the predictions and the correct gold-standard annotations. |
+| `attr`           | `str`               | The attribute to score. |
+| `getter`         | `callable`          | Defaults to `getattr`. If provided, `getter(doc, attr)` should return the cats for an individual `Doc`. |
+| `labels`         | `Iterable[str]`     | The set of possible labels. Defaults to `[]`. |
+| `multi_label`    | `bool`              | Whether the attribute allows multiple labels. Defaults to `True`. |
+| `positive_label` | `str`               | The positive label for a binary task with exclusive classes. Defaults to `None`. |
+| **RETURNS**      | `Dict`              | A dictionary containing the scores, with inapplicable scores as `None`: 1) for all: `attr_score` (one of `attr_f` / `attr_macro_f` / `attr_macro_auc`), `attr_score_desc` (text description of the overall score), `attr_f_per_type`, `attr_auc_per_type`; 2) for binary exclusive with positive label: `attr_p/r/f`; 3) for 3+ exclusive classes, macro-averaged fscore: `attr_macro_f`; 4) for multilabel, macro-averaged AUC: `attr_macro_auc` |
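
Note (not part of the patch): the flow these docs describe is easiest to see end to end. You build `Example` objects pairing a pipeline's predictions with gold-standard annotations, then hand them to `Scorer.score`. A minimal sketch, assuming a spaCy v3 install with `en_core_web_sm` available; in released v3 `Example` is importable from `spacy.training` (the nightlies contemporary with this patch exposed it as `spacy.gold.Example`), and the gold entity offsets below are illustrative:

```python
# Sketch of the documented Scorer.score flow (assumes spaCy v3 and
# en_core_web_sm; see the import caveat above).
import spacy
from spacy.scorer import Scorer
from spacy.training import Example

nlp = spacy.load("en_core_web_sm")
text = "Apple is looking at buying U.K. startup for $1 billion"

# Pair the pipeline's prediction with gold-standard annotations.
pred_doc = nlp(text)
gold = {"entities": [(0, 5, "ORG"), (27, 31, "GPE"), (44, 54, "MONEY")]}
example = Example.from_dict(pred_doc, gold)

# With no pipeline given, the Scorer falls back to the default
# multi-language `xx` pipeline described in the docs above.
scorer = Scorer()
scores = scorer.score([example])
print(scores["ents_p"], scores["ents_r"], scores["ents_f"])
```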
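Since the `score_cats` return contract is the substance of this patch, here is one more sketch of how the new keys come back. The labels, texts, and prediction scores are invented, with the same import assumption as above:

```python
# Hypothetical binary textcat evaluation: exclusive classes plus a
# positive label, so per the table above the overall `cats_score` is the
# positive-label F-score and `cats_score_desc` names the metric used.
import spacy
from spacy.scorer import Scorer
from spacy.training import Example

nlp = spacy.blank("en")
examples = []
for text, gold_cats in [
    ("A cheerful story", {"POSITIVE": 1.0, "NEGATIVE": 0.0}),
    ("A grim story", {"POSITIVE": 0.0, "NEGATIVE": 1.0}),
]:
    doc = nlp.make_doc(text)
    doc.cats = {"POSITIVE": 0.8, "NEGATIVE": 0.2}  # toy predictions
    examples.append(Example.from_dict(doc, {"cats": gold_cats}))

scores = Scorer.score_cats(
    examples,
    "cats",
    labels=["POSITIVE", "NEGATIVE"],
    multi_label=False,
    positive_label="POSITIVE",
)
print(scores["cats_score"], scores["cats_score_desc"])
# Metrics that don't apply to this configuration come back as None,
# e.g. `cats_macro_auc` for this non-multilabel setup.
```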