Commit Graph

15824 Commits

Author SHA1 Message Date
Sofie Van Landeghem
4d869fcc11
Small fixes to docstrings (#11610)
* add missing scorer arg to docstring

* fix class names in textcat_multilabel

* add missing scorer to docstrings
2022-10-12 15:17:40 +02:00
Adriane Boyd
fe06e037bc
Fix init for pymorphy2_lookup lemmatizer mode (#11631) 2022-10-12 12:18:39 +02:00
Paul O'Leary McCann
2e52479eec
Fix example code for spacy-wordnet (#11593)
* Fix example code for spacy-wordnet

It looks like in the most recent version, 0.1.0, it's no longer possible
to pass the lang parameter to the component separately. Doing so will
raise an error.

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Cleanup

* More cleanup

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-10-11 16:45:05 +02:00
Sofie Van Landeghem
29649589fc
remove dtype (#11615) 2022-10-11 15:25:05 +02:00
Sofie Van Landeghem
ef74f8f5e4
Fix mypy error in edittree lemmatizer (#11612)
* cleanup imports

* try limiting Thinc to previous release

* remove Model specification

* fix code and revert Thinc constraint
2022-10-11 14:15:22 +02:00
Richard Hudson
92762e69b4
Merge branch 'master' into feature/etl 2022-10-06 17:04:54 +02:00
richardpaulhudson
f410c066f4 Documentation improvements 2022-10-06 15:40:51 +02:00
richardpaulhudson
761d5ab9c3 Update errors 2022-10-06 15:12:41 +02:00
richardpaulhudson
581f380c00 Python code and documentation 2022-10-06 15:10:27 +02:00
richardpaulhudson
06fe50a12d Corrections 2022-10-06 08:04:50 +02:00
richardpaulhudson
f2c73aa85d Corrections 2022-10-06 07:50:35 +02:00
richardpaulhudson
7d4e99425b Another temporary # type: ignore 2022-10-05 19:30:10 +02:00
richardpaulhudson
2a6c1cf63c Add temporary # type: ignore comments 2022-10-05 19:15:18 +02:00
richardpaulhudson
ed76c89968 Remove extra lines 2022-10-05 18:57:10 +02:00
richardpaulhudson
28da06780e Remove extra line 2022-10-05 18:56:15 +02:00
richardpaulhudson
cbe2010e48 Format with black 2022-10-05 18:54:26 +02:00
richardpaulhudson
523bb2ad0b Temporarily commented out code 2022-10-05 18:51:08 +02:00
richardpaulhudson
6bb8d26528 Improvements 2022-10-05 18:35:46 +02:00
richardpaulhudson
d6c77659dc New error 2022-10-05 14:18:07 +02:00
richardpaulhudson
f712e0bc4a Performance improvements 2022-10-05 14:17:28 +02:00
Adriane Boyd
8cd77dd54c
Sync flake8 version across requirements (#11580) 2022-10-04 11:23:04 +02:00
Sofie Van Landeghem
b187076a2d
fix docs (#11573) 2022-10-03 17:01:04 +02:00
Sofie Van Landeghem
3033babe98
Merge pull request #11571 from svlandeg/copy_develop
update develop with latest from master, incl CI fix
2022-10-03 14:05:51 +02:00
svlandeg
83425d4f6f Merge branch 'copy_master' into copy_develop 2022-10-03 13:06:31 +02:00
Sofie Van Landeghem
70e21dfcad
PR to test importlib-metadata (#11569)
* empty commit

* restrict importlib-metadata to lower than 5.0.0

* restrict importlib-metadata also for validate CI step

* set fixed version for CI

* try flake8 5.0.4 in CI validation step

* remove importlib-metadata from requirements again
2022-10-03 13:04:03 +02:00
Paul O'Leary McCann
087cc74c6a
Remove mention of 1.7 from issue template (#11570)
It's rare to have anyone using v1 anymore, so this message is no longer
helpful.
2022-10-03 11:53:21 +02:00
Sofie Van Landeghem
bf6e43ab2f
Merge pull request #11563 from svlandeg/develop_copy
update develop with latest from master
2022-10-03 09:34:38 +02:00
richardpaulhudson
d296ae9d8e Intermediate state 2022-09-30 22:26:14 +02:00
svlandeg
9c8cdb403e Merge branch 'master_copy' into develop_copy 2022-09-30 15:40:26 +02:00
richardpaulhudson
da63b9448b Intermediate state 2022-09-29 22:09:18 +02:00
Gabriele Picco
ff9002b726
Add Zshot Spacy plugin (#11557)
* Add Zshot Spacy plugin

Add Zshot (Zero and Few shot named entity & relationships recognition) Spacy plugin

* Update website/meta/universe.json

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/meta/universe.json

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-09-29 17:34:44 +02:00
Sofie Van Landeghem
bcda8bc1e7
update mypy to latest version (#11546)
* update mypy and disable it for python 3.6

* ignoring mypy's type redefinition error
2022-09-29 14:24:40 +02:00
richardpaulhudson
644d6131af Intermediate state 2022-09-29 13:14:42 +02:00
Paul O'Leary McCann
ba63f57f81
Update docs to reflect Doc input to Language (#11555) 2022-09-29 18:50:29 +09:00
Adriane Boyd
6d7630c5d3
Allow overriding spacy_version in spacy package meta (#11552) 2022-09-29 10:44:06 +02:00
Peter Baumgartner
e794d4ae39
debug data Spancat Table Improvements (#11504)
* update

* fix format function

* pull out _format_number

* format with black
2022-09-28 17:16:05 +02:00
Raphael Mitsch
aea16719be
Simplify and clarify enable/disable behavior of spacy.load() (#11459)
* Change enable/disable behavior so that arguments take precedence over config options. Extend error message on conflict. Add warning message in case of overwriting config option with arguments.

* Fix tests in test_serialize_pipeline.py to reflect changes to handling of enable/disable.

* Fix type issue.

* Move comment.

* Move comment.

* Issue UserWarning instead of printing wasabi message. Adjust test.

* Added pytest.warns(UserWarning) for expected warning to fix tests.

* Update warning message.

* Move type handling out of fetch_pipes_status().

* Add global variable for default value. Use id() to determine whether the values passed are the default value.

* Fix default value for disable.

* Rename DEFAULT_PIPE_STATUS to _DEFAULT_EMPTY_PIPES.
2022-09-27 14:22:36 +02:00
Taniguchi Yasufumi
9557b0fb01
Add spacy-partial-tagger to spaCy Universe (#11538) 2022-09-27 14:11:50 +02:00
Jacobo Myerston
3e8bc1272f
add punctuation to grc (#11426)
* add punctuation to grc

Add support for special editorial punctuation that is common in ancient Greek texts. Ancient Greek texts, as found in digital and print form, have been largely edited by scholars. Restorations and improvements are normally marked with special characters that need to be handled properly by the tokenizer.

* add unit tests

* simplify regex

* move generic quotes to char classes

* rename unit test

* fix regex

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: svlandeg <svlandeg@github.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-09-27 11:38:56 +02:00
Paul O'Leary McCann
a44b7d4622
Add experimental coref docs (#11291)
* Add experimental coref docs

* Docs cleanup

* Apply suggestions from code review

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Apply changes from code review

* Fix prettier formatting

It seems a period after a number made this think it was a list?

* Update docs on examples for initialize

* Add docs for coref scorers

* Remove 3.4 notes from coref

There won't be a "new" tag until it's in core.

* Add docs for span cleaner

* Fix docs

* Fix docs to match spacy-experimental

These weren't properly updated when the code was moved out of spacy
core.

* More doc fixes

* Formatting

* Update architectures

* Fix links

* Fix another link

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <svlandeg@github.com>
2022-09-27 18:11:23 +09:00
Adriane Boyd
877671e09a
Preserve missing entity annotation in augmenters (#11540)
Preserve both `-` and `O` annotation in augmenters rather than relying
on `Example.to_dict`'s default support for one option outside of labeled
entity spans.

This is intended as a temporary workaround for augmenters for v3.4.x.
The behavior of `Example` and related IOB utils could be improved in the
general case for v3.5.
2022-09-27 10:16:51 +02:00
Paul O'Leary McCann
936a5f0506
Fix English pipeline names in 3.4 release notes (#11542) 2022-09-27 08:25:24 +02:00
Richard Hudson
6f692a06d5
Remove side effects from Doc.__init__() (#11506)
* Remove side effects from Doc.__init__()

* Changes based on review comment

* Readd test

* Change interface of Doc.__init__()

* Simplify test

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update doc.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-09-26 15:58:21 +02:00
Basile Dura
f40d2fac29
fix: remove duplicate v3.2 (#11530) 2022-09-23 13:18:51 +02:00
richardpaulhudson
6f42d79c1e Intermediate state 2022-09-16 20:00:20 +02:00
Raphael Mitsch
af9b01ef97
Add dependency check to project step runs (#11226)
* Add dependency check to project step running.

* Fix dependency mismatch warning.

* Remove newline.

* Add types-setuptools to setup.cfg.

* Move types-setuptools to test requirements. Move warnings into _validate_requirements(). Handle file reading in project_run().

* Remove newline formatting for output of package conflicts.

* Show full version conflict message instead of just package name.

* Update spacy/cli/project/run.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Fix typo.

* Re-add rephrasing of message for conflicting packages. Remove requirements path redundancy.

* Update spacy/cli/project/run.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update spacy/cli/project/run.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Print unified message for requirement conflicts and missing requirements.

* Update spacy/cli/project/run.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Fix warning message.

* Print conflict/missing messages individually.

* Print conflict/missing messages individually.

* Add check_requirements setting in project.yml to disable requirements check.

* Update website/docs/usage/projects.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update website/docs/usage/projects.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Update description of project.yml structure in projects.md.

* Update website/docs/usage/projects.md

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Prettify projects docs.

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-09-16 16:54:31 +02:00
richardpaulhudson
d575b9f8d4 Return 64-bit integers 2022-09-16 13:28:58 +02:00
github-actions[bot]
279358be63
Auto-format code with black (#11513)
Co-authored-by: explosion-bot <explosion-bot@users.noreply.github.com>
2022-09-16 11:50:19 +02:00
Sofie Van Landeghem
df0b815c23
more explicit Example constructor example (#11489)
* make constructor example for Example more explicit

* shorten example and add spaces
2022-09-16 09:26:33 +02:00
Sofie Van Landeghem
d5c8498f2f
disable mypy run for Python 3.10 (#11508) (#11511) 2022-09-15 17:41:25 +02:00
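
Note on aea16719be (#11459): the commit message describes letting call arguments take precedence over config options, and using a global default value plus an identity check to tell whether a caller passed anything. A minimal sketch of that general pattern follows. It is not spaCy's code: `_DEFAULT_EMPTY` and `resolve_disabled` are hypothetical names, the warning text is invented, and `is` stands in for the `id()` comparison the message mentions.

```python
import warnings
from typing import List, Sequence

# Hypothetical module-level sentinel. It is a list on purpose: a fresh []
# passed by a caller is a different object, so identity tells them apart.
_DEFAULT_EMPTY: List[str] = []


def resolve_disabled(
    config_disabled: Sequence[str],
    disable: Sequence[str] = _DEFAULT_EMPTY,
) -> List[str]:
    """Return the names to disable; an explicit argument beats the config."""
    if disable is _DEFAULT_EMPTY:
        # Nothing was passed, so the config value applies unchanged.
        return list(config_disabled)
    if config_disabled and set(disable) != set(config_disabled):
        # The argument overrides a differing config value: warn, then obey it.
        warnings.warn(
            f"Config disables {list(config_disabled)}, but the call passed "
            f"{list(disable)}; using the call argument.",
            UserWarning,
        )
    return list(disable)
```

With this shape, `resolve_disabled(["ner"])` falls back to the config, while `resolve_disabled(["ner"], disable=[])` treats the empty list as an explicit request to disable nothing and emits a `UserWarning`.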