spaCy/spacy/cli
Daniël de Kok da7ad97519
Update TextCatBOW to use the fixed SparseLinear layer (#13149)
* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: https://github.com/explosion/thinc/pull/754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TextCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
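
The rounding rule described above (a `length` that is not a power of two is rounded up to the next one) can be sketched as follows. This is only an illustration under stated assumptions: the helper name `next_power_of_two` is hypothetical and is not taken from the spaCy code base, and the actual implementation in `TextCatBOW` may differ.

```python
def next_power_of_two(length: int) -> int:
    """Round `length` up to the nearest power of two.

    Hypothetical helper for illustration only; not spaCy's actual code.
    Values that already are a power of two are returned unchanged.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    # (length - 1).bit_length() is the number of bits needed to represent
    # length - 1, so shifting 1 by that amount gives the smallest power
    # of two that is >= length.
    return 1 << (length - 1).bit_length()


if __name__ == "__main__":
    for requested in (1, 3, 1000, 262144, 262145):
        print(requested, "->", next_power_of_two(requested))
```

With a rule like this, a requested length such as 1000 would become 1024, while a length that already is a power of two stays as it is.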
2023-11-29 09:11:54 +01:00
..
project Restore spacy.cli.project API (#13053) 2023-10-10 15:35:25 +02:00
templates Update TextCatBOW to use the fixed SparseLinear layer (#13149) 2023-11-29 09:11:54 +01:00
__init__.py Restore spacy.cli.project API (#13053) 2023-10-10 15:35:25 +02:00
_util.py Remove pathy dependency, update docs for cloudpathlib in Weasel (#13035) 2023-10-05 08:50:22 +02:00
apply.py Always use tqdm with disable=None 2023-09-28 17:12:42 +02:00
assemble.py Tests for CLI app - init config generates train-able config (#12173) 2023-07-31 14:45:04 +02:00
benchmark_speed.py Always use tqdm with disable=None 2023-09-28 17:12:42 +02:00
convert.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
debug_config.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
debug_data.py Add spancat_singlelabel to debug data CLI (#12749) 2023-06-26 10:25:20 +02:00
debug_diff.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
debug_model.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
download.py Warn about reloading dependencies after downloading models (#13081) 2023-11-10 08:05:07 +01:00
evaluate.py add --spans-key option for CLI spancat evaluation (#12981) 2023-09-25 11:25:41 +02:00
find_function.py Add cli for finding locations of registered func (#12757) 2023-07-31 09:39:00 +02:00
find_threshold.py Tests for CLI app - init config generates train-able config (#12173) 2023-07-31 14:45:04 +02:00
info.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
init_config.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
init_pipeline.py Tests for CLI app - init config generates train-able config (#12173) 2023-07-31 14:45:04 +02:00
package.py Add preferred use of build for package CLI (#13109) 2023-11-08 17:35:24 +01:00
pretrain.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00
profile.py Always use tqdm with disable=None 2023-09-28 17:12:42 +02:00
train.py Tests for CLI app - init config generates train-able config (#12173) 2023-07-31 14:45:04 +02:00
validate.py Configure isort to use the Black profile, recursively isort the spacy module (#12721) 2023-06-14 17:48:41 +02:00