Creative Commons Legal Code

CC0 1.0 Universal

CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
HEREUNDER.

Statement of Purpose

The laws of most jurisdictions throughout the world automatically confer
exclusive Copyright and Related Rights (defined below) upon the creator
and subsequent owner(s) (each and all, an "owner") of an original work of
authorship and/or a database (each, a "Work").

Certain owners wish to permanently relinquish those rights to a Work for
the purpose of contributing to a commons of creative, cultural and
scientific works ("Commons") that the public can reliably and without fear
of later claims of infringement build upon, modify, incorporate in other
works, reuse and redistribute as freely as possible in any form whatsoever
and for any purposes, including without limitation commercial purposes.
These owners may contribute to the Commons to promote the ideal of a free
culture and the further production of creative, cultural and scientific
works, or to gain reputation or greater distribution for their Work in
part through the use and efforts of others.

For these and/or other purposes and motivations, and without any
expectation of additional consideration or compensation, the person
associating CC0 with a Work (the "Affirmer"), to the extent that he or she
is an owner of Copyright and Related Rights in the Work, voluntarily
elects to apply CC0 to the Work and publicly distribute the Work under its
terms, with knowledge of his or her Copyright and Related Rights in the
Work and the meaning and intended legal effect of CC0 on those rights.

1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights"). Copyright and Related Rights include, but are not
limited to, the following:

  i. the right to reproduce, adapt, distribute, perform, display,
     communicate, and translate a Work;
 ii. moral rights retained by the original author(s) and/or performer(s);
iii. publicity and privacy rights pertaining to a person's image or
     likeness depicted in a Work;
 iv. rights protecting against unfair competition in regards to a Work,
     subject to the limitations in paragraph 4(a), below;
  v. rights protecting the extraction, dissemination, use and reuse of data
     in a Work;
 vi. database rights (such as those arising under Directive 96/9/EC of the
     European Parliament and of the Council of 11 March 1996 on the legal
     protection of databases, and under any national implementation
     thereof, including any amended or successor version of such
     directive); and
vii. other similar, equivalent or corresponding rights throughout the
     world based on applicable law or treaty, and any national
     implementations thereof.

2. Waiver. To the greatest extent permitted by, but not in contravention
of, applicable law, Affirmer hereby overtly, fully, permanently,
irrevocably and unconditionally waives, abandons, and surrenders all of
Affirmer's Copyright and Related Rights and associated claims and causes
of action, whether now known or unknown (including existing as well as
future claims and causes of action), in the Work (i) in all territories
worldwide, (ii) for the maximum duration provided by applicable law or
treaty (including future time extensions), (iii) in any current or future
medium and for any number of copies, and (iv) for any purpose whatsoever,
including without limitation commercial, advertising or promotional
purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
member of the public at large and to the detriment of Affirmer's heirs and
successors, fully intending that such Waiver shall not be subject to
revocation, rescission, cancellation, termination, or any other legal or
equitable action to disrupt the quiet enjoyment of the Work by the public
as contemplated by Affirmer's express Statement of Purpose.

3. Public License Fallback. Should any part of the Waiver for any reason
be judged legally invalid or ineffective under applicable law, then the
Waiver shall be preserved to the maximum extent permitted taking into
account Affirmer's express Statement of Purpose. In addition, to the
extent the Waiver is so judged Affirmer hereby grants to each affected
person a royalty-free, non transferable, non sublicensable, non exclusive,
irrevocable and unconditional license to exercise Affirmer's Copyright and
Related Rights in the Work (i) in all territories worldwide, (ii) for the
maximum duration provided by applicable law or treaty (including future
time extensions), (iii) in any current or future medium and for any number
of copies, and (iv) for any purpose whatsoever, including without
limitation commercial, advertising or promotional purposes (the
"License"). The License shall be deemed effective as of the date CC0 was
applied by Affirmer to the Work. Should any part of the License for any
reason be judged legally invalid or ineffective under applicable law, such
partial invalidity or ineffectiveness shall not invalidate the remainder
of the License, and in such case Affirmer hereby affirms that he or she
will not (i) exercise any of his or her remaining Copyright and Related
Rights in the Work or (ii) assert any associated claims and causes of
action with respect to the Work, in either case contrary to Affirmer's
express Statement of Purpose.

4. Limitations and Disclaimers.

 a. No trademark or patent rights held by Affirmer are waived, abandoned,
    surrendered, licensed or otherwise affected by this document.
 b. Affirmer offers the Work as-is and makes no representations or
    warranties of any kind concerning the Work, express, implied,
    statutory or otherwise, including without limitation warranties of
    title, merchantability, fitness for a particular purpose, non
    infringement, or the absence of latent or other defects, accuracy, or
    the present or absence of errors, whether or not discoverable, all to
    the greatest extent permissible under applicable law.
 c. Affirmer disclaims responsibility for clearing rights of other persons
    that may apply to the Work or any use thereof, including without
    limitation any person's Copyright and Related Rights in the Work.
    Further, Affirmer disclaims responsibility for obtaining any necessary
    consents, permissions or other rights required for any use of the
    Work.
 d. Affirmer understands and acknowledges that Creative Commons is not a
    party to this document and has no duty or obligation with respect to
    this CC0 or use of the Work.