spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-11 22:52:39 +03:00

Author	SHA1	Message	Date
richardpaulhudson	79b2843a3f	Simple changes based on review comments	2022-12-12 11:10:10 +01:00
richardpaulhudson	ec1426700e	Avoid memcpy by writing directly to numpy data buf	2022-11-11 08:45:58 +01:00
richardpaulhudson	42f8563d0d	Remove unnecessary variable definition	2022-11-10 11:40:19 +01:00
richardpaulhudson	5b29568fb7	Fix wild pointer problem	2022-11-10 11:37:03 +01:00
richardpaulhudson	54bdc11353	Merge branch 'master' of https://github.com/explosion/spaCy into feature/etl	2022-11-09 12:24:36 +01:00
richardpaulhudson	999c0fc6c6	Format with black	2022-11-09 11:43:17 +01:00
richardpaulhudson	6a5b671261	Add full stop	2022-11-09 11:41:52 +01:00
richardpaulhudson	35d0c217d2	Final touches	2022-11-09 11:40:54 +01:00
Adriane Boyd	03eebe9d1c	Update warning, add tests for project requirements check (#11777) * Update warning, add tests for project requirements check * Make warning more general for differences between PEP 508 and pip * Add tests for _check_requirements * Parameterize test	2022-11-09 10:59:28 +01:00
Raphael Mitsch	20bbbe3e44	Revert disable/disabled merging behavior (#11745) * Merge disable with disabled. Adjust warnings, errors and tests. * Replace any() with set operation. * Update spacy/tests/pipeline/test_pipe_methods.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update docs. * Remove reference to config entry nlp.enabled from docs. Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-11-08 14:58:10 +01:00
Adriane Boyd	2e3cfd758e	Use python 3.10 for GHA universe alert (#11768)	2022-11-08 12:46:19 +09:00
Adriane Boyd	e116395f89	Add fallback in requirements check, only check once (#11735) * Add fallback in requirements check, only check once * Rename to skip_requirements_check * Update spacy/cli/project/run.py Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>	2022-11-07 14:46:08 +01:00
Adriane Boyd	6105f20d8a	Switch CI to python 3.11 (#11765)	2022-11-07 13:25:40 +01:00
Adriane Boyd	e91b47a226	Check for unsafe paths in tarfile.extractall (CVE-2007-4559) (#11746) * Adding tarfile member sanitization to extractall() * Format * Simplify and add error message * Fix import * Add comment about CVE Co-authored-by: TrellixVulnTeam <charles.mcfarland@trellix.com>	2022-11-07 10:43:34 +01:00
Paul O'Leary McCann	b76222e56a	Raise Typer limit (#11720) * Raise typer limit to <0.7.0 * Raise limit to <0.8.0	2022-11-07 08:11:55 +01:00
Adriane Boyd	ea326cf47d	Fix types for Span.id and Span.id_ (#11744)	2022-11-07 08:11:13 +01:00
richardpaulhudson	a972791c9a	Removed extraneous import	2022-11-04 17:47:04 +01:00
richardpaulhudson	6e069c91f6	Correct .pyi file	2022-11-04 12:50:07 +01:00
richardpaulhudson	28a93fd3e3	Another correction	2022-11-04 12:44:22 +01:00
richardpaulhudson	8d703963d3	Correct error	2022-11-04 12:40:03 +01:00
richardpaulhudson	f97d6e6826	Updated example config	2022-11-04 12:36:14 +01:00
richardpaulhudson	dcfc810033	Remove extraneous import	2022-11-04 11:31:18 +01:00
github-actions[bot]	bbf64cfc43	Auto-format code with black (#11749) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2022-11-04 11:17:43 +01:00
richardpaulhudson	750628a623	Fix mypy problem	2022-11-04 11:00:33 +01:00
richardpaulhudson	f0dc60691a	Switch to 64-bit hashes	2022-11-04 10:17:25 +01:00
richardpaulhudson	7f1873ad81	Everything working after refactoring	2022-11-04 09:33:06 +01:00
richardpaulhudson	5d210a0f3b	Tidy up code	2022-11-03 21:26:47 +01:00
richardpaulhudson	aaaed55459	Save end_search_idx in variable	2022-11-03 21:06:37 +01:00
richard@explosion.ai	5d32dd6246	Intermediate state	2022-11-03 20:54:07 +01:00
richard@explosion.ai	7db2770c05	Intermediate state	2022-11-03 15:23:50 +01:00
richard@explosion.ai	b462f85a73	Correction	2022-11-03 13:37:53 +01:00
Adriane Boyd	40e1000db0	Restore Doc attr getter values in Doc.to_json (#11700)	2022-11-03 11:49:08 +01:00
richard@explosion.ai	c7a960f19e	Performance improvement	2022-11-03 11:17:07 +01:00
Paul O'Leary McCann	db56600536	Fix default parameters for load functions (fix #11706) (#11713) * Fix default parameters for load functions Some load functions used SimpleFrozenList() directly instead of the _DEFAULT_EMPTY_PIPES parameter. That mostly worked as intended, but the changes in #11459 check for equality using identity, not value, so a warning is incorrectly raised sometimes, as in #11706. This change just has all the load functions use the singleton value instead. * Add test that there are no warnings on module-based load This will succeed due to changes in this branch, but local tests with the latest release failed as intended. * Try reverting commit and see if CI changes There is an error in CI that is probably unrelated. Revert "Fix default parameters for load functions" This reverts commit `dc46b35687`. * Revert "Try reverting commit and see if CI changes" This reverts commit `2514ed07ef`. Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-11-03 10:52:59 +01:00
richard@explosion.ai	deba504173	Add FNV1A conformity tests	2022-11-03 10:19:38 +01:00
Adriane Boyd	1211552f0e	Modernize and simplify CI steps (#11738) * Use `build` instead of `python setup.py sdist` * Remove in-place build with `setup.py` * Remove `gpu` parameter and GPU tests * Keep `architecture` and `num_build_jobs` in azure steps with CI defaults * Fix use of `num_build_jobs` parameters * Remove now-unused `prefix` parameter * Test imports and CLI before installing test requirements * Remove `.egg-info` directory in addition to source directory for a warning-free `import spacy` Switch `thinc-apple-ops` test to python 3.11 (as most recent python that is tested across platforms)	2022-11-03 09:29:46 +01:00
richard@explosion.ai	557799358c	Switch to FNV1A hashing	2022-11-02 20:04:43 +01:00
richard@explosion.ai	e7626f423a	Generate Numpy array at end	2022-11-02 17:11:20 +01:00
Ryn Daniels	2fb7e4dc74	More version updates for github action deprecation warnings (#11705) * More version updates for github action deprecation warnings * fix the deprecated set-output commands * bump explosion-bot to run on ubuntu-latest	2022-11-02 15:36:30 +01:00
Adriane Boyd	420b1d854b	Update textcat scorer threshold behavior (#11696) * Update textcat scorer threshold behavior For `textcat` (with exclusive classes) the scorer should always use a threshold of 0.0 because there should be one predicted label per doc and the numeric score for that particular label should not matter. * Rename to test_textcat_multilabel_threshold * Remove all uses of threshold for multi_label=False * Update Scorer.score_cats API docs * Add tests for score_cats with thresholds * Update textcat API docs * Fix types * Convert threshold back to float * Fix threshold type in docstring * Improve formatting in Scorer API docs	2022-11-02 15:35:04 +01:00
Adriane Boyd	f7edd84b44	Switch CI to Python 3.11.0 (#11737)	2022-11-02 13:42:20 +01:00
richardpaulhudson	bbf058029a	Intermediate state	2022-11-01 20:46:55 +01:00
richardpaulhudson	2552340fb8	Get rid of memory views	2022-11-01 14:05:35 +01:00
Aaron Zipp	d25f09468c	Spelling mistake in rule-based-matching.md (#11717) Changed retokenize to retokenizer	2022-10-31 13:27:12 +09:00
richardpaulhudson	749da9d348	Speed improvements	2022-10-28 14:42:42 +02:00
richardpaulhudson	217ff36559	Tests passing again after refactoring	2022-10-28 13:31:14 +02:00
Paul O'Leary McCann	d61e742960	Handle Docs with no entities in EntityLinker (#11640) * Handle docs with no entities If a whole batch contains no entities it won't make it to the model, but it's possible for individual Docs to have no entities. Before this commit, those Docs would cause an error when attempting to concatenate arrays because the dimensions didn't match. It turns out the process of preparing the Ragged at the end of the span maker forward was a little different from list2ragged, which just uses the flatten function directly. Letting list2ragged do the conversion avoids the dimension issue. This did not come up before because in NEL demo projects it's typical for data with no entities to be discarded before it reaches the NEL component. This includes a simple direct test that shows the issue and checks it's resolved. It doesn't check if there are any downstream changes, so a more complete test could be added. A full run was tested by adding an example with no entities to the Emerson sample project. * Add a blank instance to default training data in tests Rather than adding a specific test, since not failing on instances with no entities is basic functionality, it makes sense to add it to the default set. * Fix without modifying architecture If the architecture is modified this would have to be a new version, but this change isn't big enough to merit that.	2022-10-28 10:25:34 +02:00
richardpaulhudson	5d151b4abe	Correction	2022-10-27 21:05:22 +02:00
richardpaulhudson	13e417e8d1	Intermediate state	2022-10-27 20:59:30 +02:00
richardpaulhudson	c140bd6083	Correction	2022-10-27 18:19:19 +02:00

15759 Commits