* Convert all individual values explicitly to uint64 for array-based doc representations
* Temporarily test with latest numpy v1.24.0rc
* Remove unnecessary conversion from attr_t
* Reduce number of individual casts
* Convert specifically from int32 to uint64
* Revert "Temporarily test with latest numpy v1.24.0rc"
This reverts commit eb0e3c5006.
* Also use int32 in tests
If you don't have spacy-transformers installed, but try to use `init
config` with the GPU flag, you'll get an error. The issue is that the
`use_transformers` flag in the config is conflated with the GPU flag,
and then there's an attempt to access transformers config info that may
not exist.
There may be a better way to do this, but this stops the error.
* Clarify error when words are of wrong type
See #9437
* Update docs
* Use try/except
* Apply suggestions from code review
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Add section for spacy.cli.train.train
* Add link from training page to train function
* Ensure path in train helper
* Update docs
Co-authored-by: Ines Montani <ines@ines.io>
* Update for python 3.10
* Update mac image
* Update build constraints for python 3.10
* Add extras for cupy cuda 11.3-11.5
* Remove cupy-cuda115 extra
* Require thinc>=8.0.12
* Switch CI to windows-2019
* Skip mypy for python 3.10
* Raise an error when multiprocessing is used on a GPU
As reported in #5507, a confusing exception is thrown when
multiprocessing is used with a GPU model and the `fork` multiprocessing
start method:
cupy.cuda.runtime.CUDARuntimeError: cudaErrorInitializationError: initialization error
This change checks whether one of the models uses the GPU when
multiprocessing is used. If so, raise a friendly error message.
Even though multiprocessing can work on a GPU with the `spawn` method,
it quickly runs the GPU out-of-memory on real-world data. Also,
multiprocessing on a single GPU typically does not provide large
performance gains.
* Move GPU multiprocessing check to Language.pipe
* Warn rather than error when using multiprocessing with GPU models
* Improve GPU multiprocessing warning message.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Reduce API assumptions
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update spacy/language.py
* Update spacy/language.py
* Test that warning is thrown with GPU + multiprocessing
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* add custom protocols in spacy.ty
* add a test for the new types in spacy.ty
* import Example when type checking
* some type fixes
* put Protocol in compat
* revert update check back to hasattr
* runtime_checkable in compat as well
* Replace use_ops("numpy") by use_ops("cpu") in the parser
This ensures that the best available CPU implementation is chosen
(e.g. Thinc Apple Ops on macOS).
* Run spaCy tests with apple-thinc-ops on macOS
* Remove some old version refs in the docs
* Remove warning
* Update spacy/matcher/matcher.pyx
* Remove all references to the punctuation warning
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Add the spacy.models_with_nvtx_range.v1 callback
This callback recursively adds NVTX ranges to the Models in each pipe in
a pipeline.
* Fix create_models_with_nvtx_range type signature
* NVTX range: wrap models of all trainable pipes jointly
This avoids that (sub-)models that are shared between pipes get wrapped
twice.
* NVTX range callback: make color configurable
Add forward_color and backprop_color options to set the color for the
NVTX range.
* Move create_models_with_nvtx_range to spacy.ml
* Update create_models_with_nvtx_range for thinc changes
with_nvtx_range now updates an existing node, rather than returning a
wrapper node. So, we can simply walk over the nodes and update them.
* NVTX: use after_pipeline_creation in example