spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-10 19:57:17 +03:00

Author	SHA1	Message	Date
Elia Robyn Lake (Robyn Speer)	5b0b0ca809	Move WandB loggers into spacy-loggers (#9223 ) * factor out the WandB logger into spacy-loggers Signed-off-by: Elia Robyn Speer <gh@arborelia.net> * depend on spacy-loggers so they are available Signed-off-by: Elia Robyn Speer <gh@arborelia.net> * remove docs of spacy.WandbLogger.v2 (moved to spacy-loggers) Signed-off-by: Elia Robyn Speer <elia@explosion.ai> * Version number suggestions from code review Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * update references to WandbLogger Signed-off-by: Elia Robyn Speer <elia@explosion.ai> * make order of deps more consistent Signed-off-by: Elia Robyn Speer <elia@explosion.ai> Co-authored-by: Elia Robyn Speer <elia@explosion.ai> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-09-29 11:12:50 +02:00
Adriane Boyd	12ab49342c	Sync requirements in setup.cfg	2021-09-27 09:16:31 +02:00
Adriane Boyd	d74870d38c	Prepare for v3.1.3 (#9200 ) * Update thinc and spacy-legacy requirements * Set version to v3.1.3	2021-09-14 11:03:51 +02:00
Sofie Van Landeghem	632d8d4c35	bump thinc to 8.0.9 (#9133 )	2021-09-03 13:34:42 +02:00
Sofie Van Landeghem	a17b06d18b	allow typer 0.4 (#9089 )	2021-08-31 20:53:51 +10:00
Ines Montani	4cd052e81d	Include component factories in third-party dependencies resolver (#9009 ) * Include component factories in third-party dependencies resolver * Increment catalogue and update test	2021-08-25 14:58:01 +02:00
Sofie Van Landeghem	83e27d262e	negative tag annotation (#8731 ) * unit test to unlearn tag via negative annotation * bump thinc to 8.0.8	2021-07-19 14:39:11 +02:00
Sofie Van Landeghem	e7d747e3ee	TransitionBasedParser.v1 to legacy (#8586 ) * TransitionBasedParser.v1 to legacy * register sublayers * bump spacy-legacy to 3.0.7	2021-07-06 15:26:45 +02:00
Adriane Boyd	2fc67e2aeb	Require thinc >=8.0.7 (#8572 )	2021-07-01 16:55:09 +02:00
Adriane Boyd	86d01e9229	Tidy up with flake8: imports, comparisons, etc.	2021-06-28 12:08:15 +02:00
Matthew Honnibal	f9946154d9	Add SpanCategorizer component (#6747 ) * Draft spancat model * Add spancat model * Add test for extract_spans * Add extract_spans layer * Upd extract_spans * Add spancat model * Add test for spancat model * Upd spancat model * Update spancat component * Upd spancat * Update spancat model * Add quick spancat test * Import SpanCategorizer * Fix SpanCategorizer component * Import SpanGroup * Fix span extraction * Fix import * Fix import * Upd model * Update spancat models * Add scoring, update defaults * Update and add docs * Fix type * Update spacy/ml/extract_spans.py * Auto-format and fix import * Fix comment * Fix type * Fix type * Update website/docs/api/spancategorizer.md * Fix comment Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Better defense Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Fix labels list Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/ml/extract_spans.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/pipeline/spancat.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Set annotations during update * Set annotations in spancat * fix imports in test * Update spacy/pipeline/spancat.py * replace MaxoutLogistic with LinearLogistic * fix config * various small fixes * remove set_annotations parameter in update * use our beloved tupley format with recent support for doc.spans * bugfix to allow renaming the default span_key (scores weren't showing up) * use different key in docs example * change defaults to better-working parameters from project (WIP) * register spacy.extract_spans.v1 for legacy purposes * Upd dev version so can build wheel * layers instead of architectures for smaller building blocks * Update website/docs/api/spancategorizer.md Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update website/docs/api/spancategorizer.md Co-authored-by: Adriane 
Boyd <adrianeboyd@gmail.com> * Include additional scores from overrides in combined score weights * Parameterize spans key in scoring Parameterize the `SpanCategorizer` `spans_key` for scoring purposes so that it's possible to evaluate multiple `spancat` components in the same pipeline. * Use the (intentionally very short) default spans key `sc` in the `SpanCategorizer` * Adjust the default score weights to include the default key * Adjust the scorer to use `spans_{spans_key}` as the prefix for the returned score * Revert addition of `attr_name` argument to `score_spans` and adjust the key in the `getter` instead. Note that for `spancat` components with a custom `span_key`, the score weights currently need to be modified manually in `[training.score_weights]` for them to be available during training. To suppress the default score weights `spans_sc_p/r/f` during training, set them to `null` in `[training.score_weights]`. * Update website/docs/api/scorer.md * Fix scorer for spans key containing underscore * Increment version * Add Spans to Evaluate CLI (#8439) * Add Spans to Evaluate CLI * Change to spans_key * Add spans per_type output Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Fix spancat GPU issues (#8455) * Fix GPU issues * Require thinc >=8.0.6 * Switch to glorot_uniform_init * Fix and test ngram suggester * Include final ngram in doc for all sizes * Fix ngrams for docs of the same length as ngram size * Handle batches of docs that result in no ngrams * Add tests Co-authored-by: Ines Montani <ines@ines.io> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Nirant <NirantK@users.noreply.github.com>	2021-06-24 12:35:27 +02:00
Adriane Boyd	994bed2fe2	Update dependencies (#8409 ) * Require `thinc>=8.0.5` * Use `spacy-lookups-data>=1.0.2`	2021-06-16 19:50:28 +02:00
Sofie Van Landeghem	e796aab4b3	Resizable textcat (#7862 ) * implement textcat resizing for TextCatCNN * resizing textcat in-place * simplify code * ensure predictions for old textcat labels remain the same after resizing (WIP) * fix for softmax * store softmax as attr * fix ensemble weight copy and cleanup * restructure slightly * adjust documentation, update tests and quickstart templates to use latest versions * extend unit test slightly * revert unnecessary edits * fix typo * ensemble architecture won't be resizable for now * use resizable layer (WIP) * revert using resizable layer * resizable container while avoid shape inference trouble * cleanup * ensure model continues training after resizing * use fill_b parameter * use fill_defaults * resize_layer callback * format * bump thinc to 8.0.4 * bump spacy-legacy to 3.0.6	2021-06-16 11:45:00 +02:00
Adriane Boyd	5646fcbe46	Merge remote-tracking branch 'upstream/develop' into chore/develop-into-master-v3.1	2021-06-15 15:05:17 +02:00
Adriane Boyd	b98d216205	Update Catalan language data (#8308 ) * Update Catalan language data Update Catalan language data based on contributions from the Text Mining Unit at the Barcelona Supercomputing Center: https://github.com/TeMU-BSC/spacy4release/tree/main/lang_data * Update tokenizer settings for UD Catalan AnCora Update for UD Catalan AnCora v2.7 with merged multi-word tokens. * Update test * Move prefix pattern to more generic infix pattern * Clean up	2021-06-11 10:21:22 +02:00
Adriane Boyd	6d2789452e	Restrict cython to <3.0 (#8337 )	2021-06-10 11:03:30 +02:00
Michael K	b0467d2972	Add project urls to package metadata (#7728 ) This adds the links to PyPI. To see that in action check out https://pypi.org/project/Django/ (source code: `b8c9e9fae1/setup.cfg (L27-L32)`)	2021-05-31 18:38:29 +10:00
Sofie Van Landeghem	fc37715cfb	ensure 'spacy ray' works (#7799 ) * ensure 'spacy ray' works * better fix by changing entry point	2021-05-28 18:15:31 +02:00
Adriane Boyd	06324e5a5e	Update pydantic requirements (#8127 ) Update pydantic requirements following https://github.com/explosion/thinc/pull/499	2021-05-18 11:35:50 +02:00
Adriane Boyd	cf032ec31e	Update to catalogue>=2.0.4 (#7951 )	2021-04-29 19:11:28 +02:00
Adriane Boyd	f4080983ea	Extend to cupy 9.0.0 (#7914 )	2021-04-28 10:18:24 +02:00
Adriane Boyd	946a4284be	Set spacy-legacy to >=3.0.5 (#7897 ) Set `spacy-legacy` to `>=3.0.5` due to `spacy.StaticVectors.v1` init bug.	2021-04-26 18:25:39 +02:00
Adriane Boyd	874cd02539	Set spacy-legacy to >=3.0.5 (#7897 ) Set `spacy-legacy` to `>=3.0.5` due to `spacy.StaticVectors.v1` init bug.	2021-04-26 17:06:32 +02:00
Adriane Boyd	df3444421a	Update spacy-legacy to >=3.0.4 (#7865 )	2021-04-23 12:16:12 +02:00
Sofie Van Landeghem	cfad7e21d5	fix config parsing of ints/strings (#7755 ) * add few failing tests for parsing integers and strings * bump thinc to 8.0.3	2021-04-22 18:09:13 +10:00
Adriane Boyd	d2bdaa7823	Replace negative rows with 0 in StaticVectors (#7674 ) * Replace negative rows with 0 in StaticVectors Replace negative row indices with 0-vectors in `StaticVectors`. * Increase versions related to StaticVectors * Increase versions of all architectures and layers related to `StaticVectors` * Improve efficiency of 0-vector operations Parallel `spacy-legacy` PR: https://github.com/explosion/spacy-legacy/pull/5 * Update config defaults to new versions * Update docs	2021-04-22 18:04:15 +10:00
Adriane Boyd	15bd230413	Set catalogue lower pin to v2.0.3 (#7762 ) * Set catalogue lower pin to v2.0.2 * Update importlib-metadata pins to match * Require catalogue v2.0.3 Switch to vendored `importlib-metadata` v3.2.0 provided by `catalogue`.	2021-04-19 18:37:17 +10:00
Sofie Van Landeghem	8d7af5b2b1	Ensure hyphen in config file works as string value (#7642 ) * add test for serializing '-' in a config file * bump srsly to 2.4.1	2021-04-12 14:35:57 +02:00
Ayush Chaurasia	3c2ce41dd8	W&B integration: Optional support for dataset and model checkpoint logging and versioning (#7429 ) * Add optional artifacts logging * Update docs * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/training/loggers.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Bump WandbLogger Version * Add documentation of v1 to legacy docs * bump spacy-legacy to 3.0.2 (to be released) Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>	2021-04-01 19:36:23 +02:00
Santiago Castro	af07fc3bc1	Add support for CUDA 11.2 (#7583 ) * Add support for CUDA 11.2 * Update the docs * Format Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2021-03-30 09:47:33 +02:00
Adriane Boyd	53a3b967ac	Update thinc pin and set version to v3.0.5 (#7389 )	2021-03-10 11:10:53 +01:00
svlandeg	b1945f4e73	sync pins with thinc	2021-03-02 12:06:59 +01:00
Matthew Honnibal	91a3cab1ca	Require spacy-transformers 1.0.1 for v3.0.1	2021-02-02 20:46:56 +11:00
Ines Montani	8a245076c4	Update spacy-transformers pin [ci skip]	2021-02-01 22:04:07 +11:00
Ines Montani	91e24d2b55	Update srsly pin	2021-02-01 18:24:58 +11:00
Ines Montani	95e958a229	Merge pull request #6852 from explosion/feature/replace-listeners	2021-01-30 00:58:08 +11:00
Ines Montani	756b49c184	Update spacy-lookups-data pin	2021-01-30 00:07:49 +11:00
Ines Montani	7ba29f2d03	Update spacy-transformers pin	2021-01-30 00:06:07 +11:00
Ines Montani	4a6fecd6df	Update spacy-legacy pin	2021-01-27 13:31:31 +11:00
Ines Montani	c0926c9088	WIP: Various small training changes (#6818 ) * Allow output_path to be None during training * Fix cat scoring (?) * Improve error message for weighted None score * Improve messages So we can call this in other places etc. * Fix output path check * Use latest wasabi * Revert "Improve error message for weighted None score" This reverts commit `7059926763`. * Exclude None scores from final score by default It's otherwise very difficult to keep track of the score weights if we modify a config programmatically, source components etc. * Update warnings and use logger.warning	2021-01-26 14:51:52 +11:00
Matthew Honnibal	c54c300680	Use thinc v8.0.0	2021-01-21 23:51:35 +11:00
Ines Montani	d1338966ae	Require spacy-legacy	2021-01-15 21:59:06 +11:00
Ines Montani	e99cd82367	Update version pins	2020-12-17 10:21:08 +11:00
Ines Montani	85ca8c2bdd	Merge branch 'master' into develop	2020-12-11 13:44:41 +11:00
Adriane Boyd	27bb75e2a0	Docs and extras updates for v2.3.5 * Update install instructions for updated packages * Add `cuda110` and `cuda111` extras, remove upper `cupy` pins (only compatible with `thinc>=7.4.4`)	2020-12-10 15:34:34 +01:00
Ines Montani	05a2812ae0	Merge branch 'develop' into pr/6444	2020-12-09 11:04:03 +11:00
Adriane Boyd	df4891bed1	Remove blis python version constraints (#6522 ) * Remove blis version constraints After updating the blis sdist in v0.7.4, remove python version constraints for blis build and install dependencies. * Install sdist with --prefer-binary for python 3.5 * Fix duplicate sdist install steps * Fix sdist install step types * Fix blis pins in requirements.txt * Remove wheel hack for python 3.5 from CI	2020-12-08 15:25:19 +01:00
Sofie Van Landeghem	2c27093c5f	require_cpu functionality (#6336 ) * add require_cpu from Thinc 8.0.0rc2 * add docs * fix test if cupy is not installed	2020-12-08 14:42:40 +08:00
Adriane Boyd	dcecc75270	Improve blis and numpy build dependencies (#6455 ) * Fix blis build dependencies * Add blis with python_version constraints to pyproject.toml * Add blis to setup_requires * Remove --only-binary from CI * Reduce number of builds to speed up CI * Add hack to install wheel for python 3.5 in linux * Remove os spec from CI * Remove detailed numpy build constraints * Remove detailed numpy build constraints from `pyproject.toml` because it is too difficult to maintain for many architectures * These constraints are more a reflection of what is available on pypi as binary wheels rather than any real build requirements that it is necessary for users to follow when building from source * Users building their own binary packages will need to enforce the constraints that make sense in their environments, e.g., the `conda` compatible numpy pins * Keep the build constraints in `build-constraints.txt` for use with our builds * Our builds with wheelwright are built against the earliest compatible binary versions of numpy on pypi * These constraints are documented within the distribution * Revert "Remove os spec from CI" This reverts commit `7489476688`.	2020-12-08 14:29:34 +08:00
Adriane Boyd	724831b066	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master * Update Macedonian for v3 * Update Turkish for v3	2020-11-25 11:49:34 +01:00


172 Commits