spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-09-12 15:12:39 +03:00

Author	SHA1	Message	Date
Daniël de Kok	5e297aa20e	Add `TrainablePipe.{distill,get_teacher_student_loss}` (#12016 ) * Add `TrainablePipe.{distill,get_teacher_student_loss}` This change adds two methods: - `TrainablePipe::distill` which performs a training step of a student pipe on a teacher pipe, giving a batch of `Doc`s. - `TrainablePipe::get_teacher_student_loss` computes the loss of a student relative to the teacher. The `distill` or `get_teacher_student_loss` methods are also implemented in the tagger, edit tree lemmatizer, and parser pipes, to enable distillation in those pipes and as an example for other pipes. * Fix stray `Beam` import * Fix incorrect import * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * TrainablePipe.distill: use `Iterable[Example]` * Add Pipe.is_distillable method * Add `validate_distillation_examples` This first calls `validate_examples` and then checks that the student/teacher tokens are the same. * Update distill documentation * Add distill documentation for all pipes that support distillation * Fix incorrect identifier * Apply suggestions from code review Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Add comment to explain `is_distillable` Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2023-01-16 10:25:53 +01:00
Sofie Van Landeghem	c2f3e699ca	fix anchors (#12095 )	2023-01-13 11:14:58 +01:00
Albert Villanova del Moral	25373d8e8e	Fix required maximum version of typing-extensions (#12036 ) * Fix required maximum version of typing-extensions * Restrict to <4.5.0, sync minimum pin Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2023-01-13 10:44:02 +01:00
github-actions[bot]	9ef7d26032	Auto-format code with black (#12100 ) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2023-01-13 10:12:10 +01:00
Daniël de Kok	dda7331da3	Handle missing annotations in the edit tree lemmatizer (#12098 ) The losses/gradients of missing annotations were not correctly masked out. Fix this and check the masking in the partial data test.	2023-01-12 12:13:55 +01:00
Daniël de Kok	319eb508b5	Add a `spacy benchmark speed` subcommand (#11902 ) * Add a `spacy evaluate speed` subcommand This subcommand reports the mean batch performance of a model on a data set with a 95% confidence interval. For reliability, it first performs some warmup rounds. Then it will measure performance on batches with randomly shuffled documents. To avoid having too many spaCy commands, `speed` is a subcommand of `evaluate` and accuracy evaluation is moved to its own `evaluate accuracy` subcommand. * Fix import cycle * Restore `spacy evaluate`, make `spacy benchmark speed` an alias * Add documentation for `spacy benchmark` * CREATES -> PRINTS * WPS -> words/s * Disable formatting of benchmark speed arguments * Fail with an error message when trying to speed bench empty corpus * Make it clearer that `benchmark accuracy` is a replacement for `evaluate` * Fix docstring webpage reference * tests: check `evaluate` output against `benchmark accuracy`	2023-01-12 11:55:21 +01:00
Paul O'Leary McCann	8e558095a1	Clean up displacy port-related error messages, docs (#12089 ) * Clean up displacy port-related error messages, docs There were some issues in the error messages and docs in #11948. 1. the error messages didn't specify the port argument to displacy.serve correctly 2. the docs didn't mark the auto select argument as new This addresses those issues. * Update website/docs/api/top-level.md Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com> * Apply prettier Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>	2023-01-12 14:54:09 +09:00
Sofie Van Landeghem	2c2e66e145	Merge pull request #12096 from svlandeg/copy_v4 Sync with latest from master	2023-01-11 20:46:33 +01:00
svlandeg	fc2723925b	update tests from master to follow v4 principles (2)	2023-01-11 19:04:06 +01:00
svlandeg	6ff5eb256c	update tests from master to follow v4 principles	2023-01-11 18:57:50 +01:00
svlandeg	b2fd9490e3	Merge branch 'copy_master' into copy_v4	2023-01-11 18:40:55 +01:00
Sofie Van Landeghem	554df9ef20	Website migration from Gatsby to Next (#12058 ) * Rename all MDX file to `.mdx` * Lock current node version (#11885) * Apply Prettier (#11996) * Minor website fixes (#11974) [ci skip] * fix table * Migrate to Next WEB-17 (#12005) * Initial commit * Run `npx create-next-app@13 next-blog` * Install MDX packages Following: `77b5f79a4d/packages/next-mdx/readme.md` * Add MDX to Next * Allow Next to handle `.md` and `.mdx` files. * Add VSCode extension recommendation * Disabled TypeScript strict mode for now * Add prettier * Apply Prettier to all files * Make sure to use correct Node version * Add basic implementation for `MDXRemote` * Add experimental Rust MDX parser * Add `/public` * Add SASS support * Remove default pages and styling * Convert to module This allows to use `import/export` syntax * Add import for custom components * Add ability to load plugins * Extract function This will make the next commit easier to read * Allow to handle directories for page creation * Refactoring * Allow to parse subfolders for pages * Extract logic * Redirect `index.mdx` to parent directory * Disabled ESLint during builds * Disabled typescript during build * Remove Gatsby from `README.md` * Rephrase Docker part of `README.md` * Update project structure in `README.md` * Move and rename plugins * Update plugin for wrapping sections * Add dependencies for plugin * Use plugin * Rename wrapper type * Simplify unnessary adding of id to sections The slugified section ids are useless, because they can not be referenced anywhere anyway. The navigation only works if the section has the same id as the heading. 
* Add plugin for custom attributes on Markdown elements * Add plugin to readd support for tables * Add plugin to fix problem with wrapped images For more details see this issue: https://github.com/mdx-js/mdx/issues/1798 * Add necessary meta data to pages * Install necessary dependencies * Remove outdated MDX handling * Remove reliance on `InlineList` * Use existing Remark components * Remove unallowed heading Before `h1` components where not overwritten and would never have worked and they aren't used anywhere either. * Add missing components to MDX * Add correct styling * Fix broken list * Fix broken CSS classes * Implement layout * Fix links * Fix broken images * Fix pattern image * Fix heading attributes * Rename heading attribute `new` was causing some weird issue, so renaming it to `version` * Update comment syntax in MDX * Merge imports * Fix markdown rendering inside components * Add model pages * Simplify anchors * Fix default value for theme * Add Universe index page * Add Universe categories * Add Universe projects * Fix Next problem with copy Next complains when the server renders something different then the client, therfor we move the differing logic to `useEffect` * Fix improper component nesting Next doesn't allow block elements inside a `<p>` * Replace landing page MDX with page component * Remove inlined iframe content * Remove ability to inline HTML content in iFrames * Remove MDX imports * Fix problem with image inside link in MDX * Escape character for MDX * Fix unescaped characters in MDX * Fix headings with logo * Allow to export static HTML pages * Add prebuild script This command is automatically run by Next * Replace `svg-loader` with `react-inlinesvg` `svg-loader` is no longer maintained * Fix ESLint `react-hooks/exhaustive-deps` * Fix dropdowns * Change code language from `cli` to `bash` * Remove unnessary language `none` * Fix invalid code language `markdown_` with an underscore was used to basically turn of syntax highlighting, but 
using unknown languages know throws an error. * Enable code blocks plugin * Readd `InlineCode` component MDX2 removed the `inlineCode` component > The special component name `inlineCode` was removed, we recommend to use `pre` for the block version of code, and code for both the block and inline versions Source: https://mdxjs.com/migrating/v2/#update-mdx-content * Remove unused code * Extract function to own file * Fix code syntax highlighting * Update syntax for code block meta data * Remove unused prop * Fix internal link recognition There is a problem with regex between Node and browser, and since Next runs the component on both, this create an error. `Prop `rel` did not match. Server: "null" Client: "noopener nofollow noreferrer"` This simplifies the implementation and fixes the above error. * Replace `react-helmet` with `next/head` * Fix `className` problem for JSX component * Fix broken bold markdown * Convert file to `.mjs` to be used by Node process * Add plugin to replace strings * Fix custom table row styling * Fix problem with `span` inside inline `code` React doesn't allow a `span` inside an inline `code` element and throws an error in dev mode. * Add `_document` to be able to customize `<html>` and `<body>` * Add `lang="en"` * Store Netlify settings in file This way we don't need to update via Netlify UI, which can be tricky if changing build settings. * Add sitemap * Add Smartypants * Add PWA support * Add `manifest.webmanifest` * Fix bug with anchor links after reloading There was no need for the previous implementation, since the browser handles this nativly. Additional the manual scrolling into view was actually broken, because the heading would disappear behind the menu bar. * Rename custom event I was googeling for ages to find out what kind of event `inview` is, only to figure out it was a custom event with a name that sounds pretty much like a native one. 
🫠 * Fix missing comment syntax highlighting * Refactor Quickstart component The previous implementation was hidding the irrelevant lines via data-props and dynamically generated CSS. This created problems with Next and was also hard to follow. CSS was used to do what React is supposed to handle. The new implementation simplfy filters the list of children (React elements) via their props. * Fix syntax highlighting for Training Quickstart * Unify code rendering * Improve error logging in Juniper * Fix Juniper component * Automatically generate "Read Next" link * Add Plausible * Use recent DocSearch component and adjust styling * Fix images * Turn of image optimization > Image Optimization using Next.js' default loader is not compatible with `next export`. We currently deploy to Netlify via `next export` * Dont build pages starting with `_` * Remove unused files * Add Next plugin to Netlify * Fix button layout MDX automatically adds `p` tags around text on a new line and Prettier wants to put the text on a new line. Hacking with JSX string. * Add 404 page * Apply Prettier * Update Prettier for `package.json` Next sometimes wants to patch `package-lock.json`. The old Prettier setting indended with 4 spaces, but Next always indends with 2 spaces. Since `npm install` automatically uses the indendation from `package.json` for `package-lock.json` and to avoid the format switching back and forth, both files are now set to 2 spaces. * Apply Next patch to `package-lock.json` When starting the dev server Next would warn `warn - Found lockfile missing swc dependencies, patching...` and update the `package-lock.json`. These are the patched changes. * fix link Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * small backslash fixes * adjust to new style Co-authored-by: Marcus Blättermann <marcus@essenmitsosse.de>	2023-01-11 17:30:07 +01:00
Adriane Boyd	e0168ccce9	Allow spacy-transformers v1.2.x in transformers extra (#12092 )	2023-01-11 13:54:58 +01:00
Adriane Boyd	9e0322de1a	Restore v2 token_acc score implementation (#12073 ) In the v3 scorer refactoring, `token_acc` was implemented incorrectly. It should use `precision` instead of `fscore` for the measure of correctly aligned tokens / number of predicted tokens. Fix the docs to reflect that the measure uses the number of predicted tokens rather than the number of gold tokens.	2023-01-11 08:01:47 +01:00
Kevin Humphreys	19650ebb52	Enable fuzzy text matching in Matcher (#11359 ) * enable fuzzy matching * add fuzzy param to EntityMatcher * include rapidfuzz_capi not yet used * fix type * add FUZZY predicate * add fuzzy attribute list * fix type properly * tidying * remove unnecessary dependency * handle fuzzy sets * simplify fuzzy sets * case fix * switch to FUZZYn predicates use Levenshtein distance. remove fuzzy param. remove rapidfuzz_capi. * revert changes added for fuzzy param * switch to polyleven (Python package) * enable fuzzy matching * add fuzzy param to EntityMatcher * include rapidfuzz_capi not yet used * fix type * add FUZZY predicate * add fuzzy attribute list * fix type properly * tidying * remove unnecessary dependency * handle fuzzy sets * simplify fuzzy sets * case fix * switch to FUZZYn predicates use Levenshtein distance. remove fuzzy param. remove rapidfuzz_capi. * revert changes added for fuzzy param * switch to polyleven (Python package) * fuzzy match only on oov tokens * remove polyleven * exclude whitespace tokens * don't allow more edits than characters * fix min distance * reinstate FUZZY operator with length-based distance function * handle sets inside regex operator * remove is_oov check * attempt build fix no mypy failure locally * re-attempt build fix * don't overwrite fuzzy param value * move fuzzy_match to its own Python module to allow patching * move fuzzy_match back inside Matcher simplify logic and add tests * Format tests * Parametrize fuzzyn tests * Parametrize and merge fuzzy+set tests * Format * Move fuzzy_match to a standalone method * Change regex kwarg type to bool * Add types for fuzzy_match - Refactor variable names - Add test for symmetrical behavior * Parametrize fuzzyn+set tests * Minor refactoring for fuzz/fuzzy * Make fuzzy_match a Matcher kwarg * Update type for _default_fuzzy_match * don't overwrite function param * Rename to fuzzy_compare * Update fuzzy_compare default argument declarations * allow fuzzy_compare 
override from EntityRuler * define new Matcher keyword arg * fix type definition * Implement fuzzy_compare config option for EntityRuler and SpanRuler * Rename _default_fuzzy_compare to fuzzy_compare, remove from reexported objects * Use simpler fuzzy_compare algorithm * Update types * Increase minimum to 2 in fuzzy_compare to allow one transposition * Fix predicate keys and matching for SetPredicate with FUZZY and REGEX * Add FUZZY6..9 * Add initial docs * Increase default fuzzy to rounded 30% of pattern length * Update docs for fuzzy_compare in components * Update EntityRuler and SpanRuler API docs * Rename EntityRuler and SpanRuler setting to matcher_fuzzy_compare To having naming similar to `phrase_matcher_attr`, rename `fuzzy_compare` setting for `EntityRuler` and `SpanRuler` to `matcher_fuzzy_compare. Organize next to `phrase_matcher_attr` in docs. * Fix schema aliases Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Fix typo Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Add FUZZY6-9 operators and update tests * Parameterize test over greedy Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Fix type for fuzzy_compare to remove Optional * Rename to spacy.levenshtein_compare.v1, move to spacy.matcher.levenshtein * Update docs following levenshtein_compare renaming Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2023-01-10 10:36:17 +01:00
Zhangrp	eb8bb35c13	improve ux for displacy when the serve port is in use (#11948 ) * check port in use and add itself * check port in use and add itself * Auto switch to nearest available port. * Use bind to check port instead of connect_ex. * Reformat. * Add auto_select_port argument. * update docs for displacy.serve * Update spacy/errors.py Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * Update website/docs/api/top-level.md Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * Update spacy/errors.py Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * Add test using multiprocessing * fix argument name * Increase sleep times Want to rule this out as a cause of test failure * Don't terminate a process that isn't alive * Refactor port finding logic This moves all the port logic into its own util function, which can be tested without having to background a server directly. * Use with for the server This ensures the server is closed correctly. * Pass in the host when checking port availability * Shorten argument name * Update error codes following merge * Add types for arguments, specify docstrings. * Add typing for arguments with default value. * Update docstring to match spaCy format. * Update docstring to match spaCy format. * Fix docs Arg name changed from `auto_select_port` to just `auto_select`. * Revert "Fix docs" This reverts commit `356966fe84`. Co-authored-by: zhiiw <1302593554@qq.com> Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>	2023-01-10 15:52:57 +09:00
Madeesh Kannan	a231bf65af	Pass `step=0` to `Schedule` class to yield initial learning rate (#12078 )	2023-01-09 20:15:02 +01:00
Sofie Van Landeghem	6d03b04901	Improve score_cats for use with multiple textcat components (#11820 ) * add test for running evaluate on an nlp pipeline with two distinct textcat components * cleanup * merge dicts instead of overwrite * don't add more labels to the given set * Revert "merge dicts instead of overwrite" This reverts commit `89bee0ed77`. * Switch tests to separate scorer keys rather than merged dicts * Revert unrelated edits * Switch textcat scorers to v2 * formatting Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2023-01-09 11:43:48 +01:00
Madeesh Kannan	f1dcdefc8a	Add version tag to `before_update` config key (#12059 )	2023-01-05 11:46:04 +01:00
Sofie Van Landeghem	7f6c638c3a	fix processing of "auto" in convert (#12050 ) * fix processing of "auto" in walk_directory * add check for None * move AUTO check to convert and fix verification of args * add specific CLI test with CliRunner * cleanup * more cleanup * update docstring	2023-01-05 10:21:00 +01:00
Paul O'Leary McCann	dbd829f0ed	Fix inconsistency in displaCy docs about page option (#12047 ) * Fix inconsistency in displaCy docs about page option The `page` option, which wraps the output SVG in HTML, is true by default for `serve` but not for `render`. The `render` docs were wrong though, so this updates them. * Update the same statement in more docs A few renderers used the same language	2023-01-04 12:51:40 +09:00
Tetsuo Kiso	b510fbd0aa	Delete unused imports for StringStore (#12040 )	2023-01-03 17:43:09 +01:00
Sofie Van Landeghem	326b541312	Merge pull request #12049 from svlandeg/copy_v4 Sync v4 with latest from master	2023-01-03 16:43:54 +01:00
svlandeg	6852adc8b7	Merge branch 'copy_master' into copy_v4	2023-01-03 13:34:05 +01:00
Wannaphong Phatthiyaphaibun	31c1beba78	Add spacy-pythainlp (#12038 ) * Add spacy-pythainlp * Move submission to right section * Minor cleanup * Remove extra list call * Update universe.json Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>	2023-01-03 17:03:59 +09:00
github-actions[bot]	abb0ab109d	Auto-format code with black (#12035 ) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2023-01-02 11:59:57 +01:00
Adriane Boyd	ef9e504eac	Rename modified textcat scorer to v2 (#11971 ) As a follow-up to #11696, rename the modified scorer to v2 and move the v1 scorer to `spacy-legacy`.	2022-12-29 14:01:08 +01:00
Daniël de Kok	20b63943f5	Adjust to new `Schedule` class and pass scores to `Optimizer` (#12008 ) * Adjust to new `Schedule` class and pass scores to `Optimizer` Requires https://github.com/explosion/thinc/pull/804 * Bump minimum Thinc requirement to 9.0.0.dev1	2022-12-29 08:03:24 +01:00
kadarakos	933b54ac79	typo fix (#11995 )	2022-12-26 13:26:35 +01:00
Madeesh Kannan	aa2b471a6e	New console logger with expanded progress tracking (#11972 ) * Add `ConsoleLogger.v3` This addition expands the progress bar feature to count up the training/distillation steps to either the next evaluation pass or the maximum number of steps. * Rename progress bar types * Add defaults to docs Minor fixes * Move comment * Minor punctuation fixes * Explicitly check for `None` when validating progress bar type Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>	2022-12-23 15:21:44 +01:00
github-actions[bot]	90896504a5	Auto-format code with black (#12019 ) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2022-12-23 12:44:07 +01:00
Daniël de Kok	d30ba9b7b8	Merge pull request #12015 from danieldk/chore/v4-merge-master-20221222 Merge master into v4	2022-12-22 11:22:33 +01:00
Adriane Boyd	64d2d27c5d	Add classifier for python 3.11 (#12013 )	2022-12-22 10:53:16 +01:00
Daniël de Kok	2f08deea2a	Fix fallout from a previous merge	2022-12-22 10:23:31 +01:00
Daniël de Kok	207565a788	Merge remote-tracking branch 'upstream/master' into chore/v4-merge-master-20221222	2022-12-22 10:08:54 +01:00
Raphael Mitsch	eef3d950b4	Fix `SpanGroup` and `Span` typing (#12009 ) * Correct Span.label, Span.kb_id types. Fix SpanGroup.__iter__(). * Extend test. * Rename test. Fix typo. * Add comment. * Fix types for Span.label, Span.kb_id, Span.char_span(). * Update spacy/tests/doc/test_span_group.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update docs. * Fix typo. * Update spacy/tokens/span_group.pyx Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>	2022-12-21 18:54:27 +01:00
kadarakos	c223cd7a86	Add apply CLI (#11376 ) * annotate cli first try * add batch-size and n_process * rename to apply * typing fix * handle file suffixes * walk directories * support jsonl * typing fix * remove debug * make suffix optional for walk * revert unrelated * don't warn but raise * better error message * minor touch up * Update spacy/tests/test_cli.py Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> * Update spacy/cli/apply.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/cli/apply.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * update tests and bugfix * add force_overwrite * typo * fix adding .spacy suffix * Update spacy/cli/apply.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/cli/apply.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Update spacy/cli/apply.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * store user data and rename cmd arg * include test for user attr * rename cmd arg * better help message * documentation * prettier * black * link fix * Update spacy/cli/apply.py Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * Update website/docs/api/cli.md Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * Update website/docs/api/cli.md Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * Update website/docs/api/cli.md Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com> * addressing reviews * dont quit but warn * prettier Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>	2022-12-20 17:11:33 +01:00
Jos Polfliet	18ffe5bbd6	Update stop_words.py (#11997 ) fix typo in "aangaande"	2022-12-19 16:17:49 +01:00
cfuerbachersparks	3a2b655a29	Update lexeme.md (#11994 ) Change suffix_ string to end	2022-12-19 10:33:38 +01:00
Daniël de Kok	f9308aae13	Fix v4 branch to build against Thinc v9 (#11921 ) * Move `thinc.extra.search` to `spacy.pipeline._parser_internals` Backport of: https://github.com/explosion/spaCy/pull/11317 Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Replace references to `thinc.backends.linalg` with `CBlas` Backport of: https://github.com/explosion/spaCy/pull/11292 Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Use cross entropy from `thinc.legacy` * Require thinc>=9.0.0.dev0,<9.1.0 Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2022-12-17 14:32:19 +01:00
Adriane Boyd	c9d9d6847f	Update build constraints for python 3.11 (#11981 )	2022-12-15 10:55:01 +01:00
Adriane Boyd	e5c7f3b077	CI: Install thinc-apple-ops through extra (#11963 )	2022-12-12 10:13:10 +01:00
Edward	ca75190a3d	Custom extensions for spans with equal boundaries (#11429 ) * Init * Fix return type for mypy * adjust types and improve setting new attributes * Add underscore changes to json conversion * Add test and underscore changes to from_docs * add underscore changes and test to span.to_doc * update return values Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * Add types to function Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * adjust formatting Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * shorten return type Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> * add helper function to improve readability * Improve code and add comments * rerun azure tests * Fix tests for json conversion Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>	2022-12-12 08:55:53 +01:00
Adriane Boyd	0591e67265	Cast to uint64 for all array-based doc representations (#11933 ) * Convert all individual values explicitly to uint64 for array-based doc representations * Temporarily test with latest numpy v1.24.0rc * Remove unnecessary conversion from attr_t * Reduce number of individual casts * Convert specifically from int32 to uint64 * Revert "Temporarily test with latest numpy v1.24.0rc" This reverts commit `eb0e3c5006`. * Also use int32 in tests	2022-12-12 08:45:35 +01:00
Adriane Boyd	8c291ace0c	Extend to wasabi v1.1 (#11945 ) * Extend to wasabi v1.1 * Temporarily run mypy and tests with newest wasabi * Temporarily skip check requirements test * Revert "Temporarily skip check requirements test" This reverts commit `44f4ce20a8`. * Revert "Temporarily run mypy and tests with newest wasabi" This reverts commit `e677a2257c`.	2022-12-12 08:38:36 +01:00
github-actions[bot]	f22fc7a113	Auto-format code with black (#11955 ) Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>	2022-12-09 10:15:52 +01:00
Madeesh Kannan	f5aabaf7d6	Remove unused, experimental multi-task components (#11919 ) * Remove experimental multi-task components These are incomplete implementations and are not usable in their current state. * Remove orphaned error message * Switch ubuntu-latest to ubuntu-20.04 in main tests (#11928) * Switch ubuntu-latest to ubuntu-20.04 in main tests * Only use 20.04 for 3.6 * Revert "Switch ubuntu-latest to ubuntu-20.04 in main tests (#11928)" This reverts commit `77c0fd7b17`. Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>	2022-12-08 13:24:45 +01:00
Paul O'Leary McCann	d60997febb	Remove old model shortcuts (#11916 ) * Remove old model shortcuts * Remove error, docs warnings about shortcuts * Fix import in util Accidentally deleted the whole import and not just the old part... * Change universe example to v3 style * Switch ubuntu-latest to ubuntu-20.04 in main tests (#11928) * Switch ubuntu-latest to ubuntu-20.04 in main tests * Only use 20.04 for 3.6 * Update some model loading in Universe * Add v2 tag to neuralcoref * Use the spacy-version feature instead of a v2 tag Co-authored-by: svlandeg <svlandeg@github.com>	2022-12-08 11:45:52 +01:00
Paul O'Leary McCann	6b9af38eeb	Remove all references to "begin_training" (#11943 ) When v3 was released, `begin_training` was renamed to `initialize`. There were warnings in the code and docs about that. This PR removes them.	2022-12-08 11:43:52 +01:00
vincent d warmerdam	6d2ca1ab3a	Update custom solutions links (#11903 ) * Update custom solutions Will now point to https://explosion.ai/custom-solutions * added-sidebar * added-analysis-to-readme * update-landing-page	2022-12-07 16:02:09 +01:00

16084 Commits