spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-03-05 12:21:27 +03:00

Lj Miranda 42072f4468 Add spancat pipeline in spacy debug data (#10070)		2022-02-07 15:03:36 +01:00

* Setup debug data for spancat
* Add check for missing labels
* Add low-level data warning error
* Improve logic when compiling the gold train data
* Implement check for negative examples
* Remove breakpoint
* Remove ws_ents and missing entity checks
* Fix mypy errors
* Make variable name spans_key consistent
* Rename pipeline -> component for consistency
* Account for missing labels per spans_key
* Cleanup variable names for consistency
* Improve brevity of conditional statements
* Remove unused variables
* Include spans_key as an argument for _get_examples
* Add a conditional check for spans_key
* Update spancat debug data based on new API
  - Instead of using _get_labels_from_model(), I'm now using _get_labels_from_spancat() (cf. https://github.com/explosion/spaCy/pull10079)
  - The way information is displayed was also changed (text -> table)
* Rename model_labels to ensure mypy works
* Update wording on warning messages
  Use "span type" instead of "entity type" in wording the warning messages. This is because Spans aren't necessarily entities.
* Update component type into a Literal
  This is to make it clear that the component parameter should only accept either 'spancat' or 'ner'.
* Update checks to include actual model span_keys
  Instead of looking at everything in the data, we only check those span_keys from the actual spancat component. Instead of doing the filter inside the for-loop, I just made another dictionary, data_labels_in_component to hold this value.
* Update spacy/cli/debug_data.py
* Show label counts only when verbose is True

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
cli	Add spancat pipeline in spacy debug data (#10070)	2022-02-07 15:03:36 +01:00
displacy	Rename FACILITY to FAC in color list (#10067)	2022-01-20 12:00:28 +01:00
lang	fix: Add missing comma to `_eleven_to_beyond` (#10166)	2022-01-30 16:45:06 +09:00
matcher	Update typing hints (#10109)	2022-01-28 16:59:54 +01:00
ml	Auto-format code with black (#10209)	2022-02-06 16:30:30 +01:00
pipeline	Auto-format code with black (#10209)	2022-02-06 16:30:30 +01:00
tests	Fix debug data check for ents that cross sents (#10188)	2022-02-07 08:53:30 +01:00
tokens	Clarify Span.ents documentation (#10154)	2022-01-31 08:41:42 +01:00
training	User fewer Vector internals (#9879)	2022-01-18 17:14:35 +01:00
__init__.pxd	* Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags.	2014-10-24 02:23:42 +11:00
__init__.py	Tidy up and auto-format	2021-07-18 15:44:56 +10:00
__main__.py	Tidy up	2020-06-22 00:45:40 +02:00
about.py	Set version to v3.2.1 (#9823)	2021-12-07 10:51:45 +01:00
attrs.pxd	Merge branch 'develop' into master-tmp	2020-05-21 18:39:06 +02:00
attrs.pyx	Intify IOB (#9738)	2022-01-20 13:19:38 +01:00
compat.py	Custom component types in spacy.ty (#9469)	2021-10-21 15:31:06 +02:00
default_config_pretraining.cfg	Add new parameter for saving every n epoch in pretraining (#8912)	2021-08-12 11:14:48 +02:00
default_config.cfg	Add a few docs to the default_config.cfg (#9981)	2022-01-05 09:16:40 +01:00
errors.py	Intify IOB (#9738)	2022-01-20 13:19:38 +01:00
glossary.py	Add glossary entry for _SP (#8983)	2021-08-20 12:04:02 +02:00
kb.pxd	Replace cpdef variables with cdef (#7834)	2021-04-26 16:54:02 +02:00
kb.pyx	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1	2021-10-26 11:53:50 +02:00
language.py	Auto-format code with black (#10209)	2022-02-06 16:30:30 +01:00
lexeme.pxd	Fix Lexeme.from_ptr	2020-08-10 16:43:37 +02:00
lexeme.pyi	fix type of lexeme.rank (#9979)	2022-01-04 13:15:25 +01:00
lexeme.pyx	Bugfix for similarity return types (#10051)	2022-01-20 11:40:46 +01:00
lookups.py	🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)	2021-10-14 15:21:40 +02:00
morphology.pxd	Clean up Morphology imports and definitions (#7441)	2021-04-26 16:54:23 +02:00
morphology.pyx	Clean up Morphology imports and definitions (#7441)	2021-04-26 16:54:23 +02:00
parts_of_speech.pxd	Add support for Universal Dependencies v2.0	2017-03-03 13:17:34 +01:00
parts_of_speech.pyx	Drop Python 2.7 and 3.5 (#4828)	2019-12-22 01:53:56 +01:00
pipe_analysis.py	🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)	2021-10-14 15:21:40 +02:00
py.typed	Add py.typed	2021-03-16 09:48:31 +01:00
schemas.py	Add ENT_IOB key to Matcher (#9649)	2022-01-20 13:18:39 +01:00
scorer.py	Allow Scorer.score_spans to handle pred docs with missing annotation (#9701)	2021-11-23 15:17:19 +01:00
strings.pxd	Update Cython string types (#9143)	2021-09-13 17:02:17 +02:00
strings.pyi	🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)	2021-10-14 15:21:40 +02:00
strings.pyx	Update Cython string types (#9143)	2021-09-13 17:02:17 +02:00
structs.pxd	Add SpanGroup and Graph container types to represent arbitrary annotations (#6696)	2021-01-14 17:30:41 +11:00
symbols.pxd	introduce token.has_head and refer to MISSING_DEP_ (WIP)	2021-01-12 17:17:06 +01:00
symbols.pyx	introduce token.has_head and refer to MISSING_DEP_ (WIP)	2021-01-12 17:17:06 +01:00
tokenizer.pxd	Remove two attributes marked for removal in 3.1 (#9150)	2021-09-15 23:07:21 +02:00
tokenizer.pyx	Fix infix as prefix in Tokenizer.explain (#10140)	2022-01-28 17:00:54 +01:00
ty.py	Custom component types in spacy.ty (#9469)	2021-10-21 15:31:06 +02:00
typedefs.pxd	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master	2020-11-25 11:49:34 +01:00
typedefs.pyx	Tidy up rest	2017-10-27 21:07:59 +02:00
util.py	Fix references to config file in the docs & UX (#9961)	2022-01-04 14:31:26 +01:00
vectors.pyx	User fewer Vector internals (#9879)	2022-01-18 17:14:35 +01:00
vocab.pxd	Add support for floret vectors (#8909)	2021-10-27 14:08:31 +02:00
vocab.pyi	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1	2021-10-26 11:53:50 +02:00
vocab.pyx	User fewer Vector internals (#9879)	2022-01-18 17:14:35 +01:00