Adriane Boyd
ce4d33e726
Add span_finder to quickstart template
2023-06-05 09:15:07 +02:00
Adriane Boyd
9c403f1f30
Merge remote-tracking branch 'upstream/master' into add-span-finder
2023-06-02 20:01:37 +02:00
Adriane Boyd
024679c17f
Format
2023-06-02 19:59:45 +02:00
Adriane Boyd
bb62ee9450
Fix offset bug in set_annotations
...
* Ignore labels in span finder scorer
2023-06-02 19:59:45 +02:00
Adriane Boyd
f84b59d68a
Add docs and unify default configs for spancat and span finder
...
* Add `allow_overlap=True` to span finder scorer
2023-06-02 19:59:45 +02:00
Basile Dura
c3c064ace4
fix: InitializableComponent type hints ( #12692 )
...
* fix: InitializableComponent type hints
* fix: avoid circular dependency
* style: clean imports in language.py
* style: use relative imports
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* fix: apply black
---------
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-06-02 14:29:52 +02:00
kadarakos
9372b22d32
move preset_spans_suggester test to spancat tests
2023-06-02 10:08:16 +00:00
kadarakos
3ec1cb5e30
Merge branch 'add-span-finder' of https://github.com/kadarakos/spaCy into add-span-finder
2023-06-02 10:01:19 +00:00
kadarakos
bd71b87342
remove question comment
2023-06-02 10:00:57 +00:00
Adriane Boyd
3abdca27e4
Apply suggestions from code review
2023-06-02 11:52:21 +02:00
kadarakos
a33c7e0144
Update spacy/tests/pipeline/test_span_finder.py
...
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-06-02 11:13:25 +02:00
kadarakos
752b3066cf
make it clear that the span_finder_suggester is more general (not specific to span_finder)
2023-06-02 09:11:53 +00:00
kadarakos
37c4ad5007
only test suggester and test result exhaustively
2023-06-02 08:54:45 +00:00
Adriane Boyd
c4112a1da3
Require that all SpanGroup spans are from the current doc ( #12569 )
...
* Require that all SpanGroup spans are from the current doc
The restriction on only adding spans from the current doc was already
implemented for all operations except for `SpanGroup.__init__`.
Initialize copied spans for `SpanGroup.copy` with `Doc.char_span` in
order to validate the character offsets and to make it possible to copy
spans between documents with differing tokenization. Currently there is
no validation that the document texts are identical, but the span char
offsets must be valid spans in the target doc, which prevents you from
ending up with completely invalid spans.
* Undo change in test_beam_overfitting_IO
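The offset validation described above can be sketched in plain Python. This is a simplified stand-in, not spaCy's implementation: the helper names (`tokenize_with_offsets`, `char_span`, `copy_spans`) and the whitespace tokenizer are hypothetical, and raising `ValueError` for a rejected span is an assumption made for the sketch.

```python
from typing import List, Optional, Tuple

Offsets = List[Tuple[int, int]]


def tokenize_with_offsets(text: str) -> Offsets:
    """Hypothetical whitespace tokenizer: (start_char, end_char) per token."""
    offsets = []
    pos = 0
    for piece in text.split():
        start = text.index(piece, pos)
        end = start + len(piece)
        offsets.append((start, end))
        pos = end
    return offsets


def char_span(offsets: Offsets, start_char: int, end_char: int) -> Optional[Tuple[int, int]]:
    """Map character offsets to (start_token, end_token), end exclusive.

    Returns None when the offsets do not fall on token boundaries.
    """
    starts = {s: i for i, (s, _) in enumerate(offsets)}
    ends = {e: i + 1 for i, (_, e) in enumerate(offsets)}
    if start_char in starts and end_char in ends:
        if starts[start_char] < ends[end_char]:
            return starts[start_char], ends[end_char]
    return None


def copy_spans(char_spans: List[Tuple[int, int]], target: Offsets) -> List[Tuple[int, int]]:
    """Re-create spans in a target doc from character offsets only."""
    copied = []
    for start_char, end_char in char_spans:
        span = char_span(target, start_char, end_char)
        if span is None:
            # Assumption for this sketch: misaligned offsets are an error.
            raise ValueError(f"invalid span offsets: {start_char}-{end_char}")
        copied.append(span)
    return copied


source = tokenize_with_offsets("New York is big")
# Target doc with a different tokenization: "New York" is a single token.
target = [(0, 8), (9, 11), (12, 15)]
print(char_span(source, 0, 8), copy_spans([(0, 8)], target))
```

The same character offsets yield different token indices in the two tokenizations, which is why copying by character offsets rather than token indices works across differently tokenized docs. As the commit message notes, nothing here checks that the two texts are identical.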
2023-06-01 19:19:17 +02:00
kadarakos
658c4aee35
flaky test fix suggestion, hand set bias terms
2023-06-01 16:44:55 +00:00
kadarakos
56de1076a1
Update spacy/pipeline/span_finder.py
...
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-06-01 17:40:25 +02:00
Isabel Zimmerman
05df59fd4a
[DOCS] add vetiver to spacy universe ( #12557 )
...
* add vetiver to spacy universe
* remove image
* update logo to render correctly in thumbnail
* apply Basile's suggestion
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
* refer to the same model
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-06-01 17:11:18 +02:00
kadarakos
8c7c34d4f4
use the 'spans_key' variable name everywhere
2023-06-01 13:09:25 +00:00
kadarakos
fe964e7831
remove near-duplicate redundant method
2023-06-01 13:02:12 +00:00
kadarakos
09b5f61e7d
remove comment
2023-06-01 12:20:45 +00:00
kadarakos
4c2f80cf17
typo
2023-06-01 12:18:17 +00:00
kadarakos
2a1cb13069
remove debug lines
2023-06-01 10:23:35 +00:00
kadarakos
af802257f2
black
2023-06-01 10:20:14 +00:00
kadarakos
6f750d0da6
only use a single spans_key like in spancat
2023-06-01 10:19:22 +00:00
kadarakos
90af16af76
failing overfit test
2023-05-31 17:30:56 +00:00
kadarakos
f599bd5a4d
return correct variable
2023-05-31 17:30:17 +00:00
kadarakos
6e46ecfa2c
handle misaligned tokenization
2023-05-31 16:56:01 +00:00
Adriane Boyd
c936db2faf
Address numpy 1.25 deprecations in test suite ( #12684 )
...
* Address upcoming numpy v1.25 deprecations in test suite
* Temporarily test most recent numpy prerelease in CI
* Revert "Temporarily test most recent numpy prerelease in CI"
This reverts commit d75a66e55e.
2023-05-31 17:23:07 +02:00
Adriane Boyd
9b7a59c325
Revert "CI: Disable fail-fast ( #12658 )" ( #12676 )
...
This reverts commit 1f088cbf4a.
2023-05-26 10:57:02 +02:00
Vinit Ravishankar
f0e0206b77
update universe for spacypdfreader ( #12661 )
2023-05-23 13:28:48 +02:00
Adriane Boyd
1f088cbf4a
CI: Disable fail-fast ( #12658 )
...
While the typing_extensions/pydantic `Literal` bugs are being sorted
out, disable fail-fast so the rest of the CI is available for
development purposes.
2023-05-23 10:48:06 +02:00
Basile Dura
6ea4155487
feat: add comparison operators in span.pyi ( #12652 )
...
* feat: add comparison operators in span.pyi
remove Cython-specific `__richcmp__`
* fix: comparison operators should be defined for any other object
2023-05-23 08:50:37 +02:00
Victoria
6930a6bf45
Add spaCy VSCode extension materials ( #12592 )
2023-05-19 14:38:53 +02:00
Basile Dura
95fd46b1dd
feat: add type hinting on SpanGroup.__iter__ ( #12642 )
2023-05-17 14:20:00 +02:00
Adriane Boyd
df083f91a5
Add Malay to website languages ( #12643 )
2023-05-17 13:13:43 +02:00
Sani
873c16a4df
Malay language support ( #12602 )
...
* add malay lang
* fix token len
* black format
* reformat conftest malay
* remove exceptions that do not exist in dbp
* format code
2023-05-17 12:45:21 +02:00
Lj Miranda
58779c24ef
Remove shorthand for output-file in spacy apply ( #12636 )
...
The output-file argument is positional, so it can't use a shorthand like -o.
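The distinction can be illustrated with the standard library's `argparse`, used here only as a stand-in (it is not the CLI framework spaCy uses). The program name, argument names, and the `--batch-size` option are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="apply-sketch")
    # Positional arguments are identified by position, not by a flag.
    parser.add_argument("data_path")
    parser.add_argument("output_file")
    # Only options can carry a shorthand.
    parser.add_argument("--batch-size", "-b", type=int, default=1)
    return parser


def shorthand_is_rejected() -> bool:
    """Try to attach -o to a positional name; argparse refuses the mix."""
    parser = argparse.ArgumentParser()
    try:
        parser.add_argument("-o", "output_file")
    except ValueError:
        return True
    return False


args = build_parser().parse_args(["input.spacy", "output.spacy", "-b", "8"])
print(args.output_file, args.batch_size, shorthand_is_rejected())
```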
2023-05-17 12:36:29 +02:00
David Berenstein
83b6f488cb
universe: Update examples Adept Augmentation ( #12620 )
...
* Update universe.json
* chore: changed readme example as suggested by Vincent Warmerdam (koaning)
2023-05-15 14:09:33 +02:00
Adriane Boyd
3dc445df8d
Fix new tags in docs for v3.5.x ( #12629 )
...
* Fix new tags in docs for v3.5.x
* Fix new tag
2023-05-15 12:06:58 +02:00
Basile Dura
2dd8825f09
docs: add comment on offset_x argument ( #12630 )
2023-05-15 11:42:47 +02:00
Basile Dura
f96b9e03df
build: bump typer version to accept >=0.3,<0.10 ( #12631 )
2023-05-15 08:06:58 +02:00
Adriane Boyd
3637148c4d
Add scorer option to return per-component scores ( #12540 )
...
* Add scorer option to return per-component scores
Add `per_component` option to `Language.evaluate` and `Scorer.score` to
return scores keyed by `tokenizer` (hard-coded) or by component name.
Add option to `evaluate` CLI to score by component. Per-component scores
can only be saved to JSON.
* Update help text and messages
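The difference in output shape can be sketched without spaCy. This is a minimal illustration under assumptions, not the library's scorer: the `score` function, the scorer callables, and the score names below are hypothetical. Only the `per_component` flag name and the `tokenizer` key come from the commit message.

```python
import json
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]
ScoreFn = Callable[[List[str]], Scores]


def score(examples: List[str], scorers: List[Tuple[str, ScoreFn]], per_component: bool = False):
    """Run each scorer; return one merged dict or a dict keyed by component."""
    flat: Scores = {}
    by_component: Dict[str, Scores] = {}
    for name, score_fn in scorers:
        result = score_fn(examples)
        by_component[name] = result
        flat.update(result)
    return by_component if per_component else flat


# Hypothetical scorers with made-up values, for illustration only.
scorers = [
    ("tokenizer", lambda examples: {"token_acc": 1.0}),
    ("tagger", lambda examples: {"tag_acc": 0.5}),
]

merged = score([], scorers)
nested = score([], scorers, per_component=True)
# The nested result is a dict of dicts, a shape that maps directly onto JSON.
print(json.dumps(nested, indent=2))
```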
2023-05-12 15:36:54 +02:00
Kenneth Enevoldsen
88680a6eed
docs: remove invalid huggingface-hub push argument ( #12624 )
2023-05-12 09:40:28 +02:00
Adriane Boyd
b5af0fe836
Revert "Use Latin normalization for Serbian attrs ( #12608 )" ( #12621 )
...
This reverts commit 6f314f99c4.
We are reverting this until we can support this normalization more
consistently across vectors, training corpora, and lemmatizer data.
2023-05-11 11:54:16 +02:00
royashcenazi
3252f6b13f
Parsigs universe 3 ( #12617 )
...
* parsigs universe
* added model installation explanation in the description
* Update website/meta/universe.json
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
* added model installation instruction in the code example
* added biomedical category
---------
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-05-10 13:49:51 +02:00
royashcenazi
a56ab98e3c
parsigs universe ( #12616 )
...
* parsigs universe
* added model installation explanation in the description
* Update website/meta/universe.json
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
* added model installation instruction in the code example
---------
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-05-10 13:19:28 +02:00
David Berenstein
d11b549195
chore: added adept-augmentations to the spacy universe ( #12609 )
...
* chore: added adept-augmentations to the spacy universe
* Apply suggestions from code review
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
* Update universe.json
---------
Co-authored-by: Basile Dura <bdura@users.noreply.github.com>
2023-05-10 13:16:16 +02:00
Patrick J. Burns
15f16db6ca
Fix typo ( #12615 )
2023-05-09 15:52:34 +02:00
kadarakos
11a17976ec
black
2023-05-09 11:32:02 +00:00
Patrick J. Burns
eb3960a15a
Add LatinCy models to universe.json ( #12597 )
...
* Add LatinCy models to universe.json
* Update website/meta/universe.json
Add install code for LatinCy models to 'code_example'
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Update LatinCy ‘code_example’ in website/meta/universe.json
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
---------
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-05-09 12:02:45 +02:00