* avoid nesting then flattening
* mypy fix
* Apply suggestions from code review
* Add type for indices
* Run full matrix for mypy
* Add back modified type: ignore
* Revert "Run full matrix for mypy"
This reverts commit e218873d04.
---------
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Use `build` instead of `python setup.py sdist`
* Remove in-place build with `setup.py`
* Remove `gpu` parameter and GPU tests
* Keep `architecture` and `num_build_jobs` in azure steps with CI
defaults
* Fix use of `num_build_jobs` parameters
* Remove now-unused `prefix` parameter
* Test imports and CLI before installing test requirements
* Remove `*.egg-info` directory in addition to source directory for an
warning-free `import spacy`
* Convert all individual values explicitly to uint64 for array-based doc representations
* Temporarily test with latest numpy v1.24.0rc
* Remove unnecessary conversion from attr_t
* Reduce number of individual casts
* Convert specifically from int32 to uint64
* Revert "Temporarily test with latest numpy v1.24.0rc"
This reverts commit eb0e3c5006.
* Also use int32 in tests
If you don't have spacy-transformers installed, but try to use `init
config` with the GPU flag, you'll get an error. The issue is that the
`use_transformers` flag in the config is conflated with the GPU flag,
and then there's an attempt to access transformers config info that may
not exist.
There may be a better way to do this, but this stops the error.
* Add github actions for slow and gpu tests
* change weekly GPU tests to also run slow tests, and change the time
* only run the tests if there were commits in the past day
* remove duplicate line
* add sent start/end token attributes to the docs
* let has_annotation work with IS_SENT_END
* elif instead of if
* add has_annotation test for sent attributes
* fix typo
* remove duplicate is_sent_start entry in docs
* Setup debug data for spancat
* Add check for missing labels
* Add low-level data warning error
* Improve logic when compiling the gold train data
* Implement check for negative examples
* Remove breakpoint
* Remove ws_ents and missing entity checks
* Fix mypy errors
* Make variable name spans_key consistent
* Rename pipeline -> component for consistency
* Account for missing labels per spans_key
* Cleanup variable names for consistency
* Improve brevity of conditional statements
* Remove unused variables
* Include spans_key as an argument for _get_examples
* Add a conditional check for spans_key
* Update spancat debug data based on new API
- Instead of using _get_labels_from_model(), I'm now using
_get_labels_from_spancat() (cf. https://github.com/explosion/spaCy/pull10079)
- The way information is displayed was also changed (text -> table)
* Rename model_labels to ensure mypy works
* Update wording on warning messages
Use "span type" instead of "entity type" in wording the warning messages.
This is because Spans aren't necessarily entities.
* Update component type into a Literal
This is to make it clear that the component parameter should only accept
either 'spancat' or 'ner'.
* Update checks to include actual model span_keys
Instead of looking at everything in the data, we only check those
span_keys from the actual spancat component. Instead of doing the filter
inside the for-loop, I just made another dictionary,
data_labels_in_component to hold this value.
* Update spacy/cli/debug_data.py
* Show label counts only when verbose is True
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>