Raphael Mitsch
7ff3d94c9c
Re-add Candidate to setup.py.
2023-03-20 09:55:05 +01:00
Raphael Mitsch
48744b02b1
Update return type in EL tests.
2023-03-20 09:35:50 +01:00
Raphael Mitsch
1f2685029f
Reverse erroneous changes during merge.
2023-03-20 09:30:39 +01:00
Raphael Mitsch
1620a04d46
Merge branch 'v4' into refactor/span-group-for-mentions
...
# Conflicts:
# spacy/errors.py
# spacy/kb/__init__.py
# spacy/kb/candidate.pxd
# spacy/kb/candidate.pyx
# spacy/kb/kb.pyx
# spacy/kb/kb_in_memory.pyx
# spacy/ml/models/entity_linker.py
# spacy/pipeline/entity_linker.py
# spacy/tests/pipeline/test_entity_linker.py
# spacy/tests/serialize/test_serialize_kb.py
# website/docs/api/inmemorylookupkb.mdx
# website/docs/api/kb.mdx
2023-03-20 09:29:52 +01:00
Raphael Mitsch
9340eb8ad2
Introduce hierarchy for EL Candidate objects ( #12341 )
...
* Convert Candidate from Cython to Python class.
* Format.
* Fix .entity_ typo in _add_activations() usage.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update doc string of BaseCandidate.__init__().
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename Candidate to InMemoryCandidate, BaseCandidate to Candidate.
* Adjust Candidate to support and mandate numerical entity IDs.
* Format.
* Fix docstring and docs.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename alias -> mention.
* Refactor Candidate attribute names. Update docs and tests accordingly.
* Refactor Candidate attributes and their usage.
* Format.
* Fix mypy error.
* Update error code in line with v4 convention.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Updated error code.
* Simplify interface for int/str representations.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename 'alias' to 'mention'.
* Port Candidate and InMemoryCandidate to Cython.
* Remove redundant entry in setup.py.
* Add abstract class check.
* Drop storing mention.
* Update spacy/kb/candidate.pxd
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix entity_id refactoring problems in docstrings.
* Drop unused InMemoryCandidate._entity_hash.
* Update docstrings.
* Move attributes out of Candidate.
* Partially fix alias/mention terminology usage. Convert Candidate to interface.
* Remove prior_prob from supported properties in Candidate. Introduce KnowledgeBase.supports_prior_probs().
* Update docstrings related to prior_prob.
* Update alias/mention usage in doc(strings).
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Mention -> alias renaming. Drop Candidate.mentions(). Drop InMemoryLookupKB.get_alias_candidates() from docs.
* Update docstrings.
* Fix InMemoryCandidate attribute names.
* Update spacy/kb/kb.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update W401 test.
* Update spacy/errors.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/kb/kb.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Use Candidate output type for toy generators in the test suite to mimic best practices
* fix docs
* fix import
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-20 00:34:35 +01:00
Adriane Boyd
6ae7618418
Clean up Vocab constructor ( #12290 )
...
* Clean up Vocab constructor
* Change effective type of `strings` from `Iterable[str]` to `Optional[StringStore]`
* Don't automatically add strings to vocab
* Change default values to `None`
* Remove `**deprecated_kwargs`
* Format
2023-03-19 23:41:20 +01:00
Madeesh Kannan
520279ff7c
`Tok2Vec`: Add `distill` method ( #12108 )
...
* `Tok2Vec`: Add `distill` method
* `Tok2Vec`: Refactor `update`
* Add `Tok2Vec.distill` test
* Update `distill` signature to accept `Example`s instead of separate teacher and student docs
* Add docs
* Remove docstring
* Update test
* Remove `update` calls from test
* Update `Tok2Vec.distill` docstring
2023-03-09 09:37:19 +01:00
Raphael Mitsch
41b3a0d932
Drop support for EntityLinker_v1. ( #12377 )
2023-03-07 13:10:45 +01:00
Raphael Mitsch
86703da8b7
Merge branch 'refactor/el-candidates' into refactor/span-group-for-mentions
...
# Conflicts:
# spacy/pipeline/entity_linker.py
2023-03-07 09:10:10 +01:00
Raphael Mitsch
8dbb74c9c0
Merge branch 'v4' into refactor/el-candidates
2023-03-07 09:06:51 +01:00
Adriane Boyd
8ca71f9591
Merge pull request #12371 from rmitsch/sync/master-into-v4
...
Sync `v4` with latest from `master`
2023-03-06 17:10:19 +01:00
Raphael Mitsch
749e446ee3
Merge branch 'master' into sync/master-into-v4
...
# Conflicts:
# .github/azure-steps.yml
2023-03-06 16:27:56 +01:00
Adriane Boyd
0bbc620dd8
Partially work around pending deprecation of pkg_resources ( #12368 )
...
* Handle deprecation of pkg_resources
* Replace `pkg_resources` with `importlib_metadata` for `spacy info --url`
* Remove requirements check from `spacy project` given the lack of alternatives
* Fix installed model URL method and CI test
* Fix types/handling, simplify catch-all return
* Move imports instead of disabling requirements check
* Format
* Reenable test with ignored deprecation warning
* Fix except
* Fix return
2023-03-06 14:48:57 +01:00
Raphael Mitsch
2ac586fdb5
Update error code in line with v4 convention.
2023-03-05 14:43:32 +01:00
Raphael Mitsch
670e1ca7c5
Fix mypy error.
2023-03-05 14:33:32 +01:00
Raphael Mitsch
5f40b3e523
Format.
2023-03-05 14:14:16 +01:00
Raphael Mitsch
38dce966e5
Refactor Candidate attributes and their usage.
2023-03-05 13:49:13 +01:00
Raphael Mitsch
94e57d0ed5
Refactor Candidate attribute names. Update docs and tests accordingly.
2023-03-03 11:08:17 +01:00
Raphael Mitsch
46fe069f87
Rename alias -> mention.
2023-03-03 10:29:53 +01:00
Raphael Mitsch
61bacf81bd
Update website/docs/api/kb.mdx
...
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-03 09:54:28 +01:00
Sofie Van Landeghem
04f41854c1
Merge pull request #12356 from rmitsch/sync/master-into-v4
...
Sync `v4` with latest from `master`
2023-03-03 09:31:45 +01:00
Raphael Mitsch
3beda2b23a
Merge branch 'refactor/el-candidates' into refactor/span-group-for-mentions
...
# Conflicts:
# spacy/ml/models/entity_linker.py
# website/docs/api/inmemorylookupkb.mdx
2023-03-03 08:32:38 +01:00
Raphael Mitsch
1ea31552be
Merge branch 'master' into sync/master-into-v4
...
# Conflicts:
# requirements.txt
# spacy/pipeline/entity_linker.py
# spacy/util.py
# website/docs/api/entitylinker.mdx
2023-03-02 16:24:15 +01:00
Raphael Mitsch
6aa6b86d49
Make generation of empty KnowledgeBase instances configurable in EntityLinker ( #12320 )
...
* Make empty_kb() configurable.
* Format.
* Update docs.
* Be more specific in KB serialization test.
* Update KB serialization tests. Update docs.
* Remove doc update for batched candidate generation.
* Fix serialization of subclassed KB in tests.
* Format.
* Update docstring.
* Update docstring.
* Switch from pickle to json for custom field serialization.
2023-03-01 16:02:55 +01:00
Adriane Boyd
da75896ef5
Return Tuple[Span] for all Doc/Span attrs that provide spans ( #12288 )
...
* Return Tuple[Span] for all Doc/Span attrs that provide spans
* Update Span types
2023-03-01 16:00:02 +01:00
kadarakos
56aa0cc75f
Displacy doc fix ( #12352 )
...
* more details for color setting
* more details for color setting
* prettier
2023-03-01 15:38:23 +01:00
Raphael Mitsch
9bd498cdae
Fix docstring and docs.
2023-03-01 15:09:24 +01:00
Raphael Mitsch
257bca3959
Format.
2023-03-01 14:54:03 +01:00
Raphael Mitsch
fa390618c8
Adjust Candidate to support and mandate numerical entity IDs.
2023-03-01 14:50:58 +01:00
Raphael Mitsch
49abf4fb3a
Rename Candidate to InMemoryCandidate, BaseCandidate to Candidate.
2023-03-01 14:27:50 +01:00
Raphael Mitsch
417e8fea8b
Update spacy/kb/candidate.py
...
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-01 13:51:33 +01:00
Raphael Mitsch
21fa22de08
Merge branch 'refactor/el-candidates' of github.com:rmitsch/spaCy into refactor/el-candidates
2023-03-01 13:48:46 +01:00
Raphael Mitsch
3da0712582
Update doc string of BaseCandidate.__init__().
2023-03-01 13:15:38 +01:00
Raphael Mitsch
0680958476
Update spacy/kb/candidate.py
...
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-01 12:42:08 +01:00
Sofie Van Landeghem
74cae47bf6
rely on is_empty property instead of __len__ ( #12347 )
2023-03-01 12:06:07 +01:00
Raphael Mitsch
efbc3d37b3
Update docs w.r.t. spacy.CandidateBatchGenerator.v1. ( #12350 )
2023-03-01 11:01:35 +01:00
Adriane Boyd
33864f1d07
Add new tags in docs for #12334 ( #12348 )
2023-03-01 10:46:13 +01:00
Adriane Boyd
8f058e39bd
Fix error message for displacy auto_select_port ( #12343 )
2023-02-28 16:36:03 +01:00
Raphael Mitsch
50b34751eb
Update docs.
2023-02-28 15:38:28 +01:00
Raphael Mitsch
8596fb8b88
Change type for mentions to look up entity candidates for to SpanGroup from Iterable[Span].
2023-02-28 15:28:05 +01:00
TAN Long
071667376a
Add new REL_OPs: `>+`, `>-`, `<+`, and `<-` ( #12334 )
...
* Add immediate left/right child/parent dependency relations
* Add tests for new REL_OPs: `>+`, `>-`, `<+`, and `<-`.
---------
Co-authored-by: Tan Long <tanloong@foxmail.com>
2023-02-28 14:36:33 +01:00
Raphael Mitsch
a97ef65b33
Fix .entity_ typo in _add_activations() usage.
2023-02-28 14:22:27 +01:00
Raphael Mitsch
5a9d8ba73c
Format.
2023-02-28 13:56:13 +01:00
Raphael Mitsch
cd98ab4e95
Convert Candidate from Cython to Python class.
2023-02-28 13:49:52 +01:00
lise-brinck
e2de188cf1
Bugfix/swedish tokenizer ( #12315 )
...
* add unittest for explosion#12311
* create punctuation.py for swedish
* removed : from infixes in swedish punctuation.py
* allow : as infix if succeeding char is uppercase
2023-02-27 10:53:45 +01:00
Adriane Boyd
4539fbae17
Revert "Fix FUZZY operator definition ( #12318 )" ( #12336 )
...
This reverts commit daedc45d05.
The default length depends on the length of the pattern string and was
correct for this example.
2023-02-27 09:48:36 +01:00
Kevin Humphreys
acdd993071
Matcher performance fix for extension predicates: use shared key function ( #12272 )
...
* standardize predicate key format
* single key function
* Make optional args in key function keyword-only
---------
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2023-02-27 08:35:08 +01:00
Adriane Boyd
df4c069a13
Remove backoff from .vector to .tensor ( #12292 )
2023-02-23 11:36:50 +01:00
Paul O'Leary McCann
1e8bac99f3
Add tests for projects to master ( #12303 )
...
* Add tests for projects to master
* Fix git clone related issues on Windows
* Add stat import
2023-02-23 10:22:57 +01:00
andyjessen
daedc45d05
Fix FUZZY operator definition ( #12318 )
...
* Fix FUZZY operator definition
The default length of the FUZZY operator is 2 and not 3.
* adjust edit distance in matcher usage docs too
---------
Co-authored-by: svlandeg <svlandeg@github.com>
2023-02-23 09:37:40 +01:00