* implement textcat resizing for TextCatCNN
* resizing textcat in-place
* simplify code
* ensure predictions for old textcat labels remain the same after resizing (WIP)
* fix for softmax
* store softmax as attr
* fix ensemble weight copy and cleanup
* restructure slightly
* adjust documentation, update tests and quickstart templates to use latest versions
* extend unit test slightly
* revert unnecessary edits
* fix typo
* ensemble architecture won't be resizable for now
* use resizable layer (WIP)
* revert using resizable layer
* resizable container while avoiding shape inference trouble
* cleanup
* ensure model continues training after resizing
* use fill_b parameter
* use fill_defaults
* resize_layer callback
* format
* bump thinc to 8.0.4
* bump spacy-legacy to 3.0.6
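
The resizing commits above aim to keep predictions for existing textcat labels unchanged after new labels are added. The following is a minimal NumPy sketch of that idea for a single linear output layer, not the actual spaCy or thinc code. The function name `resize_output`, its `fill_W` / `fill_b` parameters, and the value `-5000.0` are assumptions made for illustration. Old rows are copied into a larger weight matrix, and for a softmax output the new bias entries are filled with a large negative value so the new classes receive essentially zero probability.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def resize_output(W, b, new_nO, fill_W=0.0, fill_b=0.0):
    """Grow an output layer from W.shape[0] to new_nO units (hypothetical helper).

    W has shape (nO, nI) and b has shape (nO,). Existing rows are copied
    unchanged; new rows are filled with fill_W / fill_b.
    """
    old_nO, nI = W.shape
    if new_nO < old_nO:
        raise ValueError("This sketch only supports growing the layer.")
    new_W = np.full((new_nO, nI), fill_W, dtype=W.dtype)
    new_b = np.full((new_nO,), fill_b, dtype=b.dtype)
    new_W[:old_nO] = W
    new_b[:old_nO] = b
    return new_W, new_b


rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
b = rng.normal(size=(2,))
X = rng.normal(size=(4, 3))

old_probs = softmax(X @ W.T + b)
# Illustrative fill value: a very negative bias keeps new softmax classes near 0.
new_W, new_b = resize_output(W, b, 3, fill_b=-5000.0)
new_probs = softmax(X @ new_W.T + new_b)
```

With independent sigmoid outputs, a zero fill would already leave the old label scores untouched. The negative bias only matters when the outputs are normalized jointly.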
* Update Catalan language data
Update Catalan language data based on contributions from the Text Mining
Unit at the Barcelona Supercomputing Center:
https://github.com/TeMU-BSC/spacy4release/tree/main/lang_data
* Update tokenizer settings for UD Catalan AnCora
Update for UD Catalan AnCora v2.7 with merged multi-word tokens.
* Update test
* Move prefix pattern to a more generic infix pattern
* Clean up
* Replace negative rows with 0 in StaticVectors
Replace negative row indices with 0-vectors in `StaticVectors`.
* Increase versions related to StaticVectors
* Increase versions of all architectures and layers related to
`StaticVectors`
* Improve efficiency of 0-vector operations
Parallel `spacy-legacy` PR: https://github.com/explosion/spacy-legacy/pull/5
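
To illustrate the `StaticVectors` change above, here is a rough NumPy sketch of a lookup that returns 0-vectors for negative row indices. It is not the layer's real implementation. The helper name `lookup_rows` and the convention that a negative index marks a missing key are assumptions for this example.

```python
import numpy as np


def lookup_rows(table, rows):
    """Gather rows from `table`, returning a 0-vector for each negative index.

    Hypothetical helper: negative indices are assumed to mean "no vector".
    """
    rows = np.asarray(rows)
    # Clip negatives to row 0 so the gather is valid. Fancy indexing returns
    # a copy, so the table itself is never modified below.
    vectors = table[np.maximum(rows, 0)]
    # Overwrite the placeholder rows with zeros.
    vectors[rows < 0] = 0.0
    return vectors


table = np.arange(6, dtype="float32").reshape(3, 2)
result = lookup_rows(table, [2, -1, 0])
```

Without the clipping step, NumPy would treat `-1` as "last row" and silently return a real vector for a missing key.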
* Update config defaults to new versions
* Update docs
* Set catalogue lower pin to v2.0.2
* Update importlib-metadata pins to match
* Require catalogue v2.0.3
Switch to vendored `importlib-metadata` v3.2.0 provided by `catalogue`.
* Allow output_path to be None during training
* Fix cat scoring (?)
* Improve error message for weighted None score
* Improve messages
So we can call this in other places etc.
* Fix output path check
* Use latest wasabi
* Revert "Improve error message for weighted None score"
This reverts commit 7059926763.
* Exclude None scores from final score by default
It's otherwise very difficult to keep track of the score weights if we modify a config programmatically, source components etc.
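
As a sketch of what excluding `None` scores from the final score can look like, the helper below skips any entry whose score or weight is `None` when computing the weighted sum. The function name `weighted_final_score` and the example keys are hypothetical. Whether the remaining weights are renormalized is not stated in the commit message, so this sketch leaves them as they are.

```python
from typing import Dict, Optional


def weighted_final_score(
    scores: Dict[str, Optional[float]],
    weights: Dict[str, Optional[float]],
) -> float:
    """Weighted sum of scores, ignoring entries whose score or weight is None."""
    total = 0.0
    for key, weight in weights.items():
        value = scores.get(key)
        if weight is None or value is None:
            # Skip instead of raising, so components that produce no score
            # (for example after sourcing) do not break the final score.
            continue
        total += weight * value
    return total


example_scores = {"cats_score": None, "ents_f": 0.8, "tag_acc": 0.9}
example_weights = {"cats_score": 0.5, "ents_f": 0.25, "tag_acc": 0.25}
final = weighted_final_score(example_scores, example_weights)
```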
* Update warnings and use logger.warning
* Remove blis version constraints
After updating the blis sdist in v0.7.4, remove python version
constraints for blis build and install dependencies.
* Install sdist with --prefer-binary for python 3.5
* Fix duplicate sdist install steps
* Fix sdist install step types
* Fix blis pins in requirements.txt
* Remove wheel hack for python 3.5 from CI
* Fix blis build dependencies
* Add blis with python_version constraints to pyproject.toml
* Add blis to setup_requires
* Remove --only-binary from CI
* Reduce number of builds to speed up CI
* Add hack to install wheel for python 3.5 in linux
* Remove os spec from CI
* Remove detailed numpy build constraints
* Remove detailed numpy build constraints from `pyproject.toml` because
it is too difficult to maintain for many architectures
* These constraints are more a reflection of what is available on
pypi as binary wheels rather than any real build requirements that
it is necessary for users to follow when building from source
* Users building their own binary packages will need to enforce the
constraints that make sense in their environments, e.g., the `conda`
compatible numpy pins
* Keep the build constraints in `build-constraints.txt` for use with our
builds
* Our builds with wheelwright are built against the earliest
compatible binary versions of numpy on pypi
* These constraints are documented within the distribution
* Revert "Remove os spec from CI"
This reverts commit 7489476688.
* Dynamically include numpy headers
* Add `build-constraints.txt` with numpy version pins for building wheels with `pip` and `wheelwright`
* Update `setup.py` to add current numpy include directory
* Assume `cython` and `numpy` are installed for `setup.py`
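
A minimal sketch of including numpy headers dynamically in a `setup.py`, assuming `numpy` and `setuptools` are already installed in the build environment. The helper name `make_extension` and the module and source names are placeholders, not the project's actual build code.

```python
import numpy
from setuptools import Extension


def make_extension(name, source):
    """Build an Extension that compiles against the installed numpy headers.

    Hypothetical helper: numpy.get_include() returns the header directory of
    whichever numpy version is present at build time, so no headers need to
    be vendored in the repository.
    """
    return Extension(
        name,
        [source],
        include_dirs=[numpy.get_include()],
        language="c++",
    )


# Placeholder module and source path, for illustration only.
ext = make_extension("pkg.module", "pkg/module.cpp")
```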
* Remove included numpy headers
* Fix typo in requirements.txt
* Use script in CI
* Update blis and thinc version ranges
* Update thinc version range
* Update setup.cfg for python 3.9
* Adjust blis and thinc ranges
* Add python 3.9 classifier
* Update CI for python 3.9
* Add --prefer-binary to CI sdist install
* Update CI python 3.7 mac image
* Add --prefer-binary to Travis CI
* Update install instructions in README
* Specify blis versions separately for < / >= 3.6
* Update --prefer-binary in README
* Test cleaner sdist install
* Also upgrade pip
(This is kind of unnecessary given --prefer-binary but may avoid other
issues related to sdist installs in the future.)
* Compile with -j 2
* Remove wheel from setup_requires
* Update to have separate CI uninstall step
* Remove wheel from pyproject.toml
* Recommend upgrading setuptools in addition to pip
* Replace pytokenizations with internal alignment
Replace pytokenizations with internal alignment algorithm that is
restricted to only allow differences in whitespace and capitalization.
* Rename `spacy.training.align` to `spacy.training.alignment` to contain
the `Alignment` dataclass
* Implement `get_alignments` in `spacy.training.align`
* Refactor trailing whitespace handling
* Remove unnecessary exception for empty docs
Allow a non-empty whitespace-only doc to be aligned with an empty doc
* Remove empty docs exceptions completely
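
The restriction described above (two tokenizations may differ only in whitespace and capitalization) can be sketched in plain Python as follows. This is an illustrative approximation and not the `get_alignments` implementation. The names `align_tokens` and `_normalize` are hypothetical, and `str.casefold` is used here as a stand-in for ignoring capitalization.

```python
from typing import List, Tuple


def _normalize(tokens: List[str]) -> Tuple[str, List[int]]:
    """Strip whitespace and case from each token; record which token owns each char."""
    chars = []
    owners = []
    for i, token in enumerate(tokens):
        norm = "".join(token.split()).casefold()
        chars.append(norm)
        owners.extend([i] * len(norm))
    return "".join(chars), owners


def align_tokens(
    tokens_a: List[str], tokens_b: List[str]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Map each token in A to the overlapping tokens in B, and vice versa."""
    norm_a, owners_a = _normalize(tokens_a)
    norm_b, owners_b = _normalize(tokens_b)
    if norm_a != norm_b:
        raise ValueError("Texts differ in more than whitespace and capitalization.")
    a2b: List[List[int]] = [[] for _ in tokens_a]
    b2a: List[List[int]] = [[] for _ in tokens_b]
    # Walk both texts character by character; tokens sharing a character align.
    for i, j in zip(owners_a, owners_b):
        if not a2b[i] or a2b[i][-1] != j:
            a2b[i].append(j)
        if not b2a[j] or b2a[j][-1] != i:
            b2a[j].append(i)
    return a2b, b2a


a2b, b2a = align_tokens(["New", "York", "City"], ["new york", "city"])
```

In this sketch, whitespace-only tokens contribute no characters and therefore align to nothing, and two empty token lists align trivially.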
* Add `cuda110` to setup.cfg and quickstart dropdown
* Switch to `pip` for pip-only packages in conda quickstart instructions
* Update zh pkuseg install message with version range and conda
* Remove `zh` from `extras_require` because the default doesn't require
additional packages