Commit Graph

15981 Commits

Author SHA1 Message Date
Adriane Boyd
b2c56a089e Update docstrings and docs 2023-06-05 12:55:41 +02:00
Adriane Boyd
d52d7d9c87 Remove debugging 2023-06-05 12:46:01 +02:00
Adriane Boyd
dac12fb684 Move settings to self.cfg, store min/max unset as None 2023-06-05 10:32:34 +02:00
Adriane Boyd
ce4d33e726 Add span_finder to quickstart template 2023-06-05 09:15:07 +02:00
Adriane Boyd
9c403f1f30 Merge remote-tracking branch 'upstream/master' into add-span-finder 2023-06-02 20:01:37 +02:00
Adriane Boyd
024679c17f Format 2023-06-02 19:59:45 +02:00
Adriane Boyd
bb62ee9450 Fix offset bug in set_annotations
* Ignore labels in span finder scorer
2023-06-02 19:59:45 +02:00
Adriane Boyd
f84b59d68a Add docs and unify default configs for spancat and span finder
* Add `allow_overlap=True` to span finder scorer
2023-06-02 19:59:45 +02:00
Basile Dura
c3c064ace4
fix: InitializableComponent type hints (#12692)
* fix: InitializableComponent type hints

* fix: avoid circular dependency

* style: clean imports in language.py

* style: use relative imports

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* fix: apply black

---------

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-06-02 14:29:52 +02:00
kadarakos
9372b22d32 move preset_spans_suggester test to spancat tests 2023-06-02 10:08:16 +00:00
kadarakos
3ec1cb5e30 Merge branch 'add-span-finder' of https://github.com/kadarakos/spaCy into add-span-finder 2023-06-02 10:01:19 +00:00
kadarakos
bd71b87342 remove question comment 2023-06-02 10:00:57 +00:00
Adriane Boyd
3abdca27e4
Apply suggestions from code review 2023-06-02 11:52:21 +02:00
kadarakos
a33c7e0144
Update spacy/tests/pipeline/test_span_finder.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-06-02 11:13:25 +02:00
kadarakos
752b3066cf make it clear that the span_finder_suggester is more general (not specific to span_finder) 2023-06-02 09:11:53 +00:00
kadarakos
37c4ad5007 only test suggester and test result exhaustively 2023-06-02 08:54:45 +00:00
Adriane Boyd
c4112a1da3
Require that all SpanGroup spans are from the current doc (#12569)
* Require that all SpanGroup spans are from the current doc

The restriction on only adding spans from the current doc was already
implemented for all operations except for `SpanGroup.__init__`.

Initialize copied spans for `SpanGroup.copy` with `Doc.char_span` in
order to validate the character offsets and to make it possible to copy
spans between documents with differing tokenization. Currently there is
no validation that the document texts are identical, but the span char
offsets must be valid spans in the target doc, which prevents you from
ending up with completely invalid spans.

* Undo change in test_beam_overfitting_IO
2023-06-01 19:19:17 +02:00
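
The commit message above describes validating copied spans through their character offsets, so that a span can be carried over to a doc with a different tokenization only if its offsets land on token boundaries there. The following is a minimal pure-Python sketch of that idea, not spaCy's implementation: `token_offsets` and this `char_span` are hypothetical helpers, and the sketch assumes a strict alignment in which misaligned offsets yield `None`. How `SpanGroup.copy` itself reports an invalid span is not stated in the commit message and is not modeled here.

```python
from typing import Dict, List, Optional, Tuple


def token_offsets(text: str, tokens: List[str]) -> List[Tuple[int, int]]:
    """Return (start_char, end_char) for each token, located left to right in text."""
    offsets = []
    pos = 0
    for tok in tokens:
        start = text.index(tok, pos)
        end = start + len(tok)
        offsets.append((start, end))
        pos = end
    return offsets


def char_span(
    text: str, tokens: List[str], start_char: int, end_char: int
) -> Optional[Tuple[int, int]]:
    """Map character offsets to (start_token, end_token), end exclusive.

    Returns None when the offsets do not fall on token boundaries
    (hypothetical helper that assumes strict alignment).
    """
    offsets = token_offsets(text, tokens)
    starts: Dict[int, int] = {s: i for i, (s, _) in enumerate(offsets)}
    ends: Dict[int, int] = {e: i + 1 for i, (_, e) in enumerate(offsets)}
    if start_char not in starts or end_char not in ends:
        return None
    start_tok, end_tok = starts[start_char], ends[end_char]
    if start_tok >= end_tok:
        return None
    return (start_tok, end_tok)


# Same text, two different tokenizations.
text = "New York-based firm"
source_tokens = ["New", "York", "-", "based", "firm"]
target_tokens = ["New", "York-based", "firm"]

# "New York" covers characters 0..8.
print(char_span(text, source_tokens, 0, 8))   # (0, 2)
print(char_span(text, target_tokens, 0, 8))   # None: 8 is not a token end in the target
# "New York-based" covers characters 0..14 and aligns in the target.
print(char_span(text, target_tokens, 0, 14))  # (0, 2)
```

Going through character offsets rather than token indices is what makes the copy independent of tokenization: token index 2 means different things in the two docs, while a character range either maps onto whole tokens or it does not.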
kadarakos
658c4aee35 flaky test fix suggestion, hand set bias terms 2023-06-01 16:44:55 +00:00
kadarakos
56de1076a1
Update spacy/pipeline/span_finder.py
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-06-01 17:40:25 +02:00
Isabel Zimmerman
05df59fd4a
[DOCS] add vetiver to spacy universe (#12557)
* add vetiver to spacy universe

* remove image

* update logo to render correctly in thumbnail

* apply Basil's suggestion

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>

* refer to the same model

---------

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-06-01 17:11:18 +02:00
kadarakos
8c7c34d4f4 use the 'spans_key' variable name everywhere 2023-06-01 13:09:25 +00:00
kadarakos
fe964e7831 remove near-duplicate redundant method 2023-06-01 13:02:12 +00:00
kadarakos
09b5f61e7d remove comment 2023-06-01 12:20:45 +00:00
kadarakos
4c2f80cf17 typo 2023-06-01 12:18:17 +00:00
kadarakos
2a1cb13069 remove debug lines 2023-06-01 10:23:35 +00:00
kadarakos
af802257f2 black 2023-06-01 10:20:14 +00:00
kadarakos
6f750d0da6 only use a single spans_key like in spancat 2023-06-01 10:19:22 +00:00
kadarakos
90af16af76 failing overfit test 2023-05-31 17:30:56 +00:00
kadarakos
f599bd5a4d return correct variable 2023-05-31 17:30:17 +00:00
kadarakos
6e46ecfa2c handle misaligned tokenization 2023-05-31 16:56:01 +00:00
Adriane Boyd
c936db2faf
Address numpy 1.25 deprecations in test suite (#12684)
* Address upcoming numpy v1.25 deprecations in test suite

* Temporarily test most recent numpy prerelease in CI

* Revert "Temporarily test most recent numpy prerelease in CI"

This reverts commit d75a66e55e.
2023-05-31 17:23:07 +02:00
Adriane Boyd
9b7a59c325
Revert "CI: Disable fail-fast (#12658)" (#12676)
This reverts commit 1f088cbf4a.
2023-05-26 10:57:02 +02:00
Vinit Ravishankar
f0e0206b77
update universe for spacypdfreader (#12661) 2023-05-23 13:28:48 +02:00
Adriane Boyd
1f088cbf4a
CI: Disable fail-fast (#12658)
While the typing_extensions/pydantic `Literal` bugs are being sorted
out, disable fail-fast so the rest of the CI is available for
development purposes.
2023-05-23 10:48:06 +02:00
Basile Dura
6ea4155487
feat: add comparison operators in span.pyi (#12652)
* feat: add comparison operators in span.pyi

remove Cython-specific `__richcmp__`

* fix: comparison operators should be defined for any other object
2023-05-23 08:50:37 +02:00
Victoria
6930a6bf45
Add spaCy VSCode extension materials (#12592) 2023-05-19 14:38:53 +02:00
Basile Dura
95fd46b1dd
feat: add type hinting on SpanGroup.__iter__ (#12642) 2023-05-17 14:20:00 +02:00
Adriane Boyd
df083f91a5
Add Malay to website languages (#12643) 2023-05-17 13:13:43 +02:00
Sani
873c16a4df
Malay language support (#12602)
* add malay lang

* fix token len

* black format

* reformat conftest malay

* remove exceptions not exist in dbp

* format code
2023-05-17 12:45:21 +02:00
Lj Miranda
58779c24ef
Remove shorthand for output-file in spacy apply (#12636)
The output-file argument is positional, so can't use a shorthand like -o.
2023-05-17 12:36:29 +02:00
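
The reasoning in the commit above (a positional argument cannot carry a shorthand such as -o) can be illustrated with a small sketch. It uses the standard-library `argparse` as a stand-in rather than spaCy's actual CLI code; the program name, the `--batch-size`/`-b` option, and the `rejects_short_flag` helper are illustrative assumptions only.

```python
import argparse

# Positional arguments are identified by position, not by a flag.
parser = argparse.ArgumentParser(prog="apply-sketch")  # hypothetical program name
parser.add_argument("data_path")
parser.add_argument("output_file")
# Options, by contrast, can declare a long name and a shorthand.
parser.add_argument("--batch-size", "-b", type=int, default=1)  # illustrative option

args = parser.parse_args(["in.spacy", "out.spacy", "-b", "8"])
print(args.output_file)  # out.spacy
print(args.batch_size)   # 8


def rejects_short_flag() -> bool:
    """Return True if argparse refuses to mix a shorthand with a positional name."""
    try:
        argparse.ArgumentParser().add_argument("-o", "output_file")
    except ValueError:
        return True
    return False


print(rejects_short_flag())  # True
```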
David Berenstein
83b6f488cb
universe: Update examples Adept Augmentation (#12620)
* Update universe.json

* chore: changed readme example as suggested by Vincent Warmerdam (koaning)
2023-05-15 14:09:33 +02:00
Adriane Boyd
3dc445df8d
Fix new tags in docs for v3.5.x (#12629)
* Fix new tags in docs for v3.5.x

* Fix new tag
2023-05-15 12:06:58 +02:00
Basile Dura
2dd8825f09
docs: add comment on offset_x argument (#12630) 2023-05-15 11:42:47 +02:00
Basile Dura
f96b9e03df
build: bump typer version to accept >=0.3<0.10 (#12631) 2023-05-15 08:06:58 +02:00
Adriane Boyd
3637148c4d
Add scorer option to return per-component scores (#12540)
* Add scorer option to return per-component scores

Add `per_component` option to `Language.evaluate` and `Scorer.score` to
return scores keyed by `tokenizer` (hard-coded) or by component name.

Add option to `evaluate` CLI to score by component. Per-component scores
can only be saved to JSON.

* Update help text and messages
2023-05-12 15:36:54 +02:00
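
The `per_component` option described above changes the shape of the returned scores: one flat dictionary versus a dictionary keyed by component name. The following is a minimal pure-Python sketch of that difference, not spaCy's `Scorer` code; the `score` function, the two scorer functions, and the example data are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

ScoreFn = Callable[[List[dict]], Dict[str, float]]


def token_accuracy(examples: List[dict]) -> Dict[str, float]:
    """Hypothetical tokenizer scorer: share of examples with matching tokens."""
    correct = sum(ex["pred_tokens"] == ex["gold_tokens"] for ex in examples)
    return {"token_acc": correct / len(examples)}


def tag_accuracy(examples: List[dict]) -> Dict[str, float]:
    """Hypothetical tagger scorer: share of examples with matching tags."""
    correct = sum(ex["pred_tag"] == ex["gold_tag"] for ex in examples)
    return {"tag_acc": correct / len(examples)}


def score(
    examples: List[dict],
    scorers: List[Tuple[str, ScoreFn]],
    per_component: bool = False,
) -> dict:
    """Merge all scores into one dict, or key them by component name."""
    scores: dict = {}
    for name, scorer in scorers:
        result = scorer(examples)
        if per_component:
            scores[name] = result   # nested: {"tagger": {"tag_acc": ...}}
        else:
            scores.update(result)   # flat: {"tag_acc": ...}
    return scores


examples = [
    {"gold_tokens": ["a", "b"], "pred_tokens": ["a", "b"], "gold_tag": "X", "pred_tag": "X"},
    {"gold_tokens": ["c"], "pred_tokens": ["c"], "gold_tag": "Y", "pred_tag": "Z"},
]
scorers = [("tokenizer", token_accuracy), ("tagger", tag_accuracy)]

flat = score(examples, scorers)
nested = score(examples, scorers, per_component=True)
print(flat)    # {'token_acc': 1.0, 'tag_acc': 0.5}
print(nested)  # {'tokenizer': {'token_acc': 1.0}, 'tagger': {'tag_acc': 0.5}}
```

The nested form is what makes it possible to tell which component produced a given score when two components report a metric under the same key, since a flat merge would let the later one overwrite the earlier.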
Kenneth Enevoldsen
88680a6eed
docs: remove invalid huggingface-hub push argument (#12624) 2023-05-12 09:40:28 +02:00
Adriane Boyd
b5af0fe836
Revert "Use Latin normalization for Serbian attrs (#12608)" (#12621)
This reverts commit 6f314f99c4.

We are reverting this until we can support this normalization more
consistently across vectors, training corpora, and lemmatizer data.
2023-05-11 11:54:16 +02:00
royashcenazi
3252f6b13f
Parsigs universe 3 (#12617)
* parsigs universe

* added model installation explanation in the description

* Update website/meta/universe.json

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>

* added model installation instruction in the code example

* added biomedical category

---------

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-05-10 13:49:51 +02:00
royashcenazi
a56ab98e3c
parsigs universe (#12616)
* parsigs universe

* added model installation explanation in the description

* Update website/meta/universe.json

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>

* added model installation instruction in the code example

---------

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-05-10 13:19:28 +02:00
David Berenstein
d11b549195
chore: added adept-augmentations to the spacy universe (#12609)
* chore: added adept-augmentations to the spacy universe

* Apply suggestions from code review

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>

* Update universe.json

---------

Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-05-10 13:16:16 +02:00