Commit Graph

15709 Commits

Author SHA1 Message Date
richardpaulhudson
2552340fb8 Get rid of memory views 2022-11-01 14:05:35 +01:00
richardpaulhudson
749da9d348 Speed improvements 2022-10-28 14:42:42 +02:00
richardpaulhudson
217ff36559 Tests passing again after refactoring 2022-10-28 13:31:14 +02:00
richardpaulhudson
5d151b4abe Correction 2022-10-27 21:05:22 +02:00
richardpaulhudson
13e417e8d1 Intermediate state 2022-10-27 20:59:30 +02:00
richardpaulhudson
c140bd6083 Correction 2022-10-27 18:19:19 +02:00
richardpaulhudson
a1b8697aab Changes after review discussion — intermed. state 2022-10-27 18:03:25 +02:00
richardpaulhudson
7d8258bec8 Correct documentation 2022-10-21 14:35:40 +02:00
richardpaulhudson
100d66a052 Fix error codes 2022-10-21 12:48:03 +02:00
Richard Hudson
34e8bc620d
Merge branch 'master' into feature/etl 2022-10-21 12:46:02 +02:00
richardpaulhudson
42b7b8d509 Major refactoring 2022-10-21 12:01:24 +02:00
github-actions[bot]
84d9cb6b38
Auto-format code with black (#11687)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2022-10-21 11:54:17 +02:00
richardpaulhudson
f7d9942e7c Intermediate state 2022-10-20 21:48:53 +02:00
Adriane Boyd
fb280001cc
Merge pull request #11678 from adrianeboyd/chore/update-develop-from-master-v3.5
Update develop from master before v3.5
2022-10-20 15:45:19 +02:00
Adriane Boyd
6c380d4fc6 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.5 2022-10-20 13:45:17 +02:00
Adriane Boyd
7e56701057 Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.5 2022-10-20 13:38:49 +02:00
Cellan Hall
b69d249a22
Adding spacy-cleaner to the spaCy universe (#11674)
* added spacy-cleaner to the spaCy universe

* Move data to right section of universe.json

* Cleanup

- fix typo ("replacers")
- spaCy doesn't need to be marked as code
- lemma of "Hello" is lower case

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
2022-10-20 20:38:29 +09:00
Paul O'Leary McCann
bf83f6872a
Add detailed example of env dict usage (#11677)
* Add detailed example of env dict usage

* Mark code blocks as yaml
2022-10-20 20:35:03 +09:00
richardpaulhudson
2707d30ce0 Intermediate state 2022-10-19 23:20:11 +02:00
Adriane Boyd
3d0e895363
Set version to v3.4.2 (#11672) 2022-10-19 17:33:55 +02:00
Edward
d66ccb8eb0
Fix multiple entries per custom extension in doc json (#11551)
* Fix multiple extensions and character offset

* Rename token_start/end to start/end

* Refactor Doc.from_json based on review

* Iterate over user_data items

* Only add non-empty underscore entries

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-10-19 15:52:47 +02:00
Adriane Boyd
a1eacaa8db
Add python 3.11.0rc2 to CI (#11667) 2022-10-18 14:36:06 +02:00
Paul O'Leary McCann
858565a567
Fix issues with DVC commands (#11592)
* Fix flag handling in dvc

Prior to this commit, if a flag (--verbose or --quiet) was passed to
DVC, it was appended to the end of the generated dvc command line. The
flag was then interpreted as part of the actual command to run, rather
than as an argument to dvc, producing command lines like:

    spacy project run preprocess --verbose

That would fail with an error that there's no such directory as
`--verbose`.

This change puts the flags at the front of the dvc command so that they
are interpreted correctly. It removes the `run_dvc_commands` function,
which had been reduced to just a for loop and wasn't used elsewhere.

A separate problem is that there's no way to specify the quiet behaviour
to dvc from the command line, though it's unclear if that's a bug.

* Add dvc quiet flag to docs

* Handle case in DVC where no commands are appropriate

If you only have commands with no deps or outputs (admittedly unlikely),
you get a weird error about the dvc file not existing. This change gives
explicit output instead.

* Add support for quiet flag

* Fix command execution

Commands are strings now because they're joined further up.
2022-10-18 15:11:39 +09:00
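The flag-ordering problem described in the commit message above can be illustrated with a minimal sketch. This is not spaCy's actual code: the function names and the exact `dvc run -n <name>` argument layout are assumptions made for illustration only. The point is that anything placed after the wrapped command is treated as part of that command, so the dvc flags have to come first.

```python
from typing import List, Sequence


def build_dvc_command(
    name: str, command: str, flags: Sequence[str] = ()
) -> List[str]:
    """Hypothetical helper: put the dvc flags ahead of the wrapped command."""
    # Everything after the dvc options is taken as the command to run,
    # so the flags are inserted before the stage name and command.
    return ["dvc", "run", *flags, "-n", name, command]


def build_dvc_command_flags_last(
    name: str, command: str, flags: Sequence[str] = ()
) -> List[str]:
    """Hypothetical helper showing the problematic ordering, for contrast."""
    # Here the flags trail the wrapped command and become part of it.
    return ["dvc", "run", "-n", name, command, *flags]


wrapped = "spacy project run preprocess"
fixed = build_dvc_command("preprocess", wrapped, ["--verbose"])
broken = build_dvc_command_flags_last("preprocess", wrapped, ["--verbose"])
print(" ".join(fixed))
print(" ".join(broken))
```

In the second form the trailing `--verbose` ends up attached to `spacy project run preprocess`, which matches the failure the commit message describes.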
Sofie Van Landeghem
2ce6aadda2
update default configs to recent versions (#11618) 2022-10-17 12:10:03 +02:00
richardpaulhudson
356a341096 Add note 2022-10-14 20:29:22 +02:00
richardpaulhudson
fa5724e927 Remove unnecessary endianness stuff 2022-10-14 20:24:32 +02:00
richardpaulhudson
342433f09d Change to trigger CI 2022-10-14 18:22:04 +02:00
richardpaulhudson
07b6b53dae Correction 2022-10-14 17:22:48 +02:00
richardpaulhudson
c6cf5f2cb4 Fix indentation problem 2022-10-14 17:09:59 +02:00
richardpaulhudson
c116e11942 Add search char byte array feature 2022-10-14 17:03:52 +02:00
github-actions[bot]
ceb62352bf
Auto-format code with black (#11649)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2022-10-14 18:04:55 +09:00
Adriane Boyd
6b5a3e7219
Extend to pydantic v1.10 (#11635)
* Update types in `spacy.schemas` for updated pydantic+mypy
2022-10-14 08:16:49 +02:00
richardpaulhudson
1e9176f9c5 Intermediate state 2022-10-13 20:50:25 +02:00
richardpaulhudson
fc99b97e3c Merge branch 'feature/etl' of https://github.com/richardpaulhudson/spacy into feature/etl 2022-10-13 12:21:54 +02:00
richardpaulhudson
be363a7710 Intermediate state necessary to test equivalence 2022-10-13 12:20:56 +02:00
Sofie Van Landeghem
4d869fcc11
Small fixes to docstrings (#11610)
* add missing scorer arg to docstring

* fix class names in textcat_multilabel

* add missing scorer to docstrings
2022-10-12 15:17:40 +02:00
Adriane Boyd
fe06e037bc
Fix init for pymorphy2_lookup lemmatizer mode (#11631) 2022-10-12 12:18:39 +02:00
Paul O'Leary McCann
2e52479eec
Fix example code for spacy-wordnet (#11593)
* Fix example code for spacy-wordnet

It looks like in the most recent version, 0.1.0, it's no longer possible
to pass the lang parameter to the component separately. Doing so will
raise an error.

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Cleanup

* More cleanup

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-10-11 16:45:05 +02:00
Sofie Van Landeghem
29649589fc
remove dtype (#11615) 2022-10-11 15:25:05 +02:00
Sofie Van Landeghem
ef74f8f5e4
Fix mypy error in edittree lemmatizer (#11612)
* cleanup imports

* try limiting Thinc to previous release

* remove Model specification

* fix code and revert Thinc constraint
2022-10-11 14:15:22 +02:00
Richard Hudson
92762e69b4
Merge branch 'master' into feature/etl 2022-10-06 17:04:54 +02:00
richardpaulhudson
f410c066f4 Documentation improvements 2022-10-06 15:40:51 +02:00
richardpaulhudson
761d5ab9c3 Update errors 2022-10-06 15:12:41 +02:00
richardpaulhudson
581f380c00 Python code and documentation 2022-10-06 15:10:27 +02:00
richardpaulhudson
06fe50a12d Corrections 2022-10-06 08:04:50 +02:00
richardpaulhudson
f2c73aa85d Corrections 2022-10-06 07:50:35 +02:00
richardpaulhudson
7d4e99425b Another temporary type:ignore 2022-10-05 19:30:10 +02:00
richardpaulhudson
2a6c1cf63c Add temporary #type:ignore s 2022-10-05 19:15:18 +02:00
richardpaulhudson
ed76c89968 Remove extra lines 2022-10-05 18:57:10 +02:00
richardpaulhudson
28da06780e Remove extra line 2022-10-05 18:56:15 +02:00