* Add doc.cats to spacy.gold at the paragraph level
Support `doc.cats` as `"cats": [{"label": string, "value": number}]` in
the spacy JSON training format at the paragraph level.
* `spacy.gold.docs_to_json()` writes `docs.cats`
* `GoldCorpus` reads in cats in each `GoldParse`
* Update instances of gold_tuples to handle cats
Update iteration over gold_tuples / gold_parses to handle addition of
cats at the paragraph level.
* Add textcat to train CLI
* Add textcat options to train CLI
* Add textcat labels in `TextCategorizer.begin_training()`
* Add textcat evaluation to `Scorer`:
* For binary exclusive classes with provided label: F1 for label
* For 2+ exclusive classes: F1 macro average
* For multilabel (not exclusive): ROC AUC macro average (currently
relying on sklearn)
* Provide user info on textcat evaluation settings, potential
incompatibilities
* Provide pipeline to Scorer in `Language.evaluate` for textcat config
* Customize train CLI output to include only metrics relevant to current
pipeline
* Add textcat evaluation to evaluate CLI
* Fix handling of unset arguments and config params
Fix handling of unset arguments and model confiug parameters in Scorer
initialization.
* Temporarily add sklearn requirement
* Remove sklearn version number
* Improve Scorer handling of models without textcats
* Fixing Scorer handling of models without textcats
* Update Scorer output for python 2.7
* Modify inf in Scorer for python 2.7
* Auto-format
Also make small adjustments to make auto-formatting with black easier and produce nicer results
* Move error message to Errors
* Update documentation
* Add cats to annotation JSON format [ci skip]
* Fix tpl flag and docs [ci skip]
* Switch to internal roc_auc_score
Switch to internal `roc_auc_score()` adapted from scikit-learn.
* Add AUCROCScore tests and improve errors/warnings
* Add tests for AUCROCScore and roc_auc_score
* Add missing error for only positive/negative values
* Remove unnecessary warnings and errors
* Make reduced roc_auc_score functions private
Because most of the checks and warnings have been stripped for the
internal functions and access is only intended through `ROCAUCScore`,
make the functions for roc_auc_score adapted from scikit-learn private.
* Check that data corresponds with multilabel flag
Check that the training instances correspond with the multilabel flag,
adding the multilabel flag if required.
* Add textcat score to early stopping check
* Add more checks to debug-data for textcat
* Add example training data for textcat
* Add more checks to textcat train CLI
* Check configuration when extending base model
* Fix typos
* Update textcat example data
* Provide licensing details and licenses for data
* Remove two labels with no positive instances from jigsaw-toxic-comment
data.
Co-authored-by: Ines Montani <ines@ines.io>
* Improve NER per type scoring
* include all gold labels in per type scoring, not only when recall > 0
* improve efficiency of per type scoring
* Create Scorer tests, initially with NER tests
* move regression test #3968 (per type NER scoring) to Scorer tests
* add new test for per type NER scoring with imperfect P/R/F and per
type P/R/F including a case where R == 0.0
* Evaluation of NER model per entity type, closes ##3490
Now each ent score is tracked individually in order to have its own Precision, Recall and F1 Score
* Keep track of each entity individually using dicts
* Improving how to compute the scores for each entity
* Fixed bug computing scores for ents
* Formatting with black
* Added key ents_per_type to the scores function
The key `ents_per_type` contains the metrics Precision, Recall and F1-Score for each entity individually
<!--- Provide a general summary of your changes in the title. -->
## Description
- [x] Use [`black`](https://github.com/ambv/black) to auto-format all `.py` files.
- [x] Update flake8 config to exclude very large files (lemmatization tables etc.)
- [x] Update code to be compatible with flake8 rules
- [x] Fix various small bugs, inconsistencies and messy stuff in the language data
- [x] Update docs to explain new code style (`black`, `flake8`, when to use `# fmt: off` and `# fmt: on` and what `# noqa` means)
Once #2932 is merged, which auto-formats and tidies up the CLI, we'll be able to run `flake8 spacy` actually get meaningful results.
At the moment, the code style and linting isn't applied automatically, but I'm hoping that the new [GitHub Actions](https://github.com/features/actions) will let us auto-format pull requests and post comments with relevant linting information.
### Types of change
enhancement, code style
## Checklist
<!--- Before you submit the PR, go over this checklist and make sure you can
tick off all the boxes. [] -> [x] -->
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
* Add spacy.errors module
* Update deprecation and user warnings
* Replace errors and asserts with new error message system
* Remove redundant asserts
* Fix whitespace
* Add messages for print/util.prints statements
* Fix typo
* Fix typos
* Move CLI messages to spacy.cli._messages
* Add decorator to display error code with message
An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.
* Remove unused link in spacy.about
* Update errors for invalid pipeline components
* Improve error for unknown factories
* Add displaCy warnings
* Update formatting consistency
* Move error message to spacy.errors
* Update errors and check if doc returned by component is None