Commit Graph

15726 Commits

Author SHA1 Message Date
richardpaulhudson
28a93fd3e3 Another correction 2022-11-04 12:44:22 +01:00
richardpaulhudson
8d703963d3 Correct error 2022-11-04 12:40:03 +01:00
richardpaulhudson
f97d6e6826 Updated example config 2022-11-04 12:36:14 +01:00
richardpaulhudson
dcfc810033 Remove extraneous import 2022-11-04 11:31:18 +01:00
richardpaulhudson
750628a623 Fix mypy problem 2022-11-04 11:00:33 +01:00
richardpaulhudson
f0dc60691a Switch to 64-bit hashes 2022-11-04 10:17:25 +01:00
richardpaulhudson
7f1873ad81 Everything working after refactoring 2022-11-04 09:33:06 +01:00
richardpaulhudson
5d210a0f3b Tidy up code 2022-11-03 21:26:47 +01:00
richardpaulhudson
aaaed55459 Save end_search_idx in variable 2022-11-03 21:06:37 +01:00
richard@explosion.ai
5d32dd6246 Intermediate state 2022-11-03 20:54:07 +01:00
richard@explosion.ai
7db2770c05 Intermediate state 2022-11-03 15:23:50 +01:00
richard@explosion.ai
b462f85a73 Correction 2022-11-03 13:37:53 +01:00
richard@explosion.ai
c7a960f19e Performance improvement 2022-11-03 11:17:07 +01:00
richard@explosion.ai
deba504173 Add FNV1A conformity tests 2022-11-03 10:19:38 +01:00
richard@explosion.ai
557799358c Switch to FNV1A hashing 2022-11-02 20:04:43 +01:00
richard@explosion.ai
e7626f423a Generate Numpy array at end 2022-11-02 17:11:20 +01:00
richardpaulhudson
bbf058029a Intermediate state 2022-11-01 20:46:55 +01:00
richardpaulhudson
2552340fb8 Get rid of memory views 2022-11-01 14:05:35 +01:00
richardpaulhudson
749da9d348 Speed improvements 2022-10-28 14:42:42 +02:00
richardpaulhudson
217ff36559 Tests passing again after refactoring 2022-10-28 13:31:14 +02:00
richardpaulhudson
5d151b4abe Correction 2022-10-27 21:05:22 +02:00
richardpaulhudson
13e417e8d1 Intermediate state 2022-10-27 20:59:30 +02:00
richardpaulhudson
c140bd6083 Correction 2022-10-27 18:19:19 +02:00
richardpaulhudson
a1b8697aab Changes after review discussion — intermed. state 2022-10-27 18:03:25 +02:00
richardpaulhudson
7d8258bec8 Correct documentation 2022-10-21 14:35:40 +02:00
richardpaulhudson
100d66a052 Fix error codes 2022-10-21 12:48:03 +02:00
Richard Hudson
34e8bc620d
Merge branch 'master' into feature/etl 2022-10-21 12:46:02 +02:00
richardpaulhudson
42b7b8d509 Major refactoring 2022-10-21 12:01:24 +02:00
github-actions[bot]
84d9cb6b38
Auto-format code with black (#11687)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2022-10-21 11:54:17 +02:00
richardpaulhudson
f7d9942e7c Intermediate state 2022-10-20 21:48:53 +02:00
Adriane Boyd
fb280001cc
Merge pull request #11678 from adrianeboyd/chore/update-develop-from-master-v3.5
Update develop from master before v3.5
2022-10-20 15:45:19 +02:00
Adriane Boyd
6c380d4fc6 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.5 2022-10-20 13:45:17 +02:00
Adriane Boyd
7e56701057 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.5 2022-10-20 13:38:49 +02:00
Cellan Hall
b69d249a22
Adding spacy-cleaner to the spaCy universe (#11674)
* added spacy-cleaner to the spaCy universe

* Move data to right section of universe.json

* Cleanup

- fix typo ("replacers")
- spaCy doesn't need to be marked as code
- lemma of "Hello" is lower case

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2022-10-20 20:38:29 +09:00
Paul O'Leary McCann
bf83f6872a
Add detailed example of env dict usage (#11677)
* Add detailed example of env dict usage

* Mark code blocks as yaml
2022-10-20 20:35:03 +09:00
richardpaulhudson
2707d30ce0 Intermediate state 2022-10-19 23:20:11 +02:00
Adriane Boyd
3d0e895363
Set version to v3.4.2 (#11672) 2022-10-19 17:33:55 +02:00
Edward
d66ccb8eb0
Fix multiple entries per custom extension in doc json (#11551)
* Fix multiple extensions and character offset

* Rename token_start/end to start/end

* Refactor Doc.from_json based on review

* Iterate over user_data items

* Only add non-empty underscore entries

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-10-19 15:52:47 +02:00
Adriane Boyd
a1eacaa8db
Add python 3.11.0rc2 to CI (#11667) 2022-10-18 14:36:06 +02:00
Paul O'Leary McCann
858565a567
Fix issues with DVC commands (#11592)
* Fix flag handling in dvc

Prior to this commit, if a flag (--verbose or --quiet) was passed to
DVC, it would be added to the end of the generated dvc command line.
The flag would then be interpreted as part of the actual command to
run, rather than as an argument to dvc, producing command lines like:

    spacy project run preprocess --verbose

That would fail with an error that there's no such directory as
`--verbose`.

This change puts the flags at the front of the dvc command so that they
are interpreted correctly. It removes the `run_dvc_commands` function,
which had been reduced to just a for loop and wasn't used elsewhere.

A separate problem is that there's no way to specify the quiet behaviour
to dvc from the command line, though it's unclear if that's a bug.

* Add dvc quiet flag to docs

* Handle case in DVC where no commands are appropriate

If you only have commands with no deps or outputs (admittedly unlikely), you
get a weird error about the dvc file not existing. This gives explicit
output instead.

* Add support for quiet flag

* Fix command execution

Commands are strings now because they're joined further up.
2022-10-18 15:11:39 +09:00
Sofie Van Landeghem
2ce6aadda2
update default configs to recent versions (#11618) 2022-10-17 12:10:03 +02:00
richardpaulhudson
356a341096 Add note 2022-10-14 20:29:22 +02:00
richardpaulhudson
fa5724e927 Remove unnecessary endianness stuff 2022-10-14 20:24:32 +02:00
richardpaulhudson
342433f09d Change to trigger CI 2022-10-14 18:22:04 +02:00
richardpaulhudson
07b6b53dae Correction 2022-10-14 17:22:48 +02:00
richardpaulhudson
c6cf5f2cb4 Fix indentation problem 2022-10-14 17:09:59 +02:00
richardpaulhudson
c116e11942 Add search char byte array feature 2022-10-14 17:03:52 +02:00
github-actions[bot]
ceb62352bf
Auto-format code with black (#11649)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2022-10-14 18:04:55 +09:00
Adriane Boyd
6b5a3e7219
Extend to pydantic v1.10 (#11635)
* Update types in `spacy.schemas` for updated pydantic+mypy
2022-10-14 08:16:49 +02:00
richardpaulhudson
1e9176f9c5 Intermediate state 2022-10-13 20:50:25 +02:00