Raphael Mitsch
3102e2e27a
Entity linking: use SpanGroup instead of Iterable[Span] for mentions (#12344)
* Convert Candidate from Cython to Python class.
* Format.
* Fix .entity_ typo in _add_activations() usage.
* Change the type of the mentions used for entity candidate lookup from Iterable[Span] to SpanGroup.
* Update docs.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update doc string of BaseCandidate.__init__().
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename Candidate to InMemoryCandidate, BaseCandidate to Candidate.
* Adjust Candidate to support and mandate numerical entity IDs.
* Format.
* Fix docstring and docs.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename alias -> mention.
* Refactor Candidate attribute names. Update docs and tests accordingly.
* Refactor Candidate attributes and their usage.
* Format.
* Fix mypy error.
* Update error code in line with v4 convention.
* Reverse erroneous changes during merge.
* Update return type in EL tests.
* Re-add Candidate to setup.py.
* Format updated docs.
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-20 12:25:18 +01:00
Raphael Mitsch
9340eb8ad2
Introduce hierarchy for EL Candidate objects (#12341)
* Convert Candidate from Cython to Python class.
* Format.
* Fix .entity_ typo in _add_activations() usage.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update doc string of BaseCandidate.__init__().
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename Candidate to InMemoryCandidate, BaseCandidate to Candidate.
* Adjust Candidate to support and mandate numerical entity IDs.
* Format.
* Fix docstring and docs.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename alias -> mention.
* Refactor Candidate attribute names. Update docs and tests accordingly.
* Refactor Candidate attributes and their usage.
* Format.
* Fix mypy error.
* Update error code in line with v4 convention.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Updated error code.
* Simplify interface for int/str representations.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename 'alias' to 'mention'.
* Port Candidate and InMemoryCandidate to Cython.
* Remove redundant entry in setup.py.
* Add abstract class check.
* Drop storing mention.
* Update spacy/kb/candidate.pxd
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix entity_id refactoring problems in docstrings.
* Drop unused InMemoryCandidate._entity_hash.
* Update docstrings.
* Move attributes out of Candidate.
* Partially fix alias/mention terminology usage. Convert Candidate to interface.
* Remove prior_prob from supported properties in Candidate. Introduce KnowledgeBase.supports_prior_probs().
* Update docstrings related to prior_prob.
* Update alias/mention usage in doc(strings).
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Mention -> alias renaming. Drop Candidate.mentions(). Drop InMemoryLookupKB.get_alias_candidates() from docs.
* Update docstrings.
* Fix InMemoryCandidate attribute names.
* Update spacy/kb/kb.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update W401 test.
* Update spacy/errors.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/kb/kb.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Use Candidate output type for toy generators in the test suite to mimic best practices.
* Fix docs.
* Fix import.
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-20 00:34:35 +01:00
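
The hierarchy described in this commit (an abstract Candidate interface, with InMemoryCandidate as the concrete class that carries prior_prob) can be pictured roughly as below. This is a minimal pure-Python sketch, not the spaCy source: the actual classes are Cython, and the attribute names `entity_id` / `entity_id_` and the constructor signature are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Candidate(ABC):
    """Interface only: exposes an integer and a string entity ID (names assumed)."""

    @property
    @abstractmethod
    def entity_id(self) -> int:
        """Numerical entity ID."""

    @property
    @abstractmethod
    def entity_id_(self) -> str:
        """String entity ID."""


class InMemoryCandidate(Candidate):
    """Concrete candidate; prior_prob lives here, not on the interface."""

    def __init__(self, entity_id: int, entity_id_: str, prior_prob: float):
        # Hypothetical constructor signature, chosen for this sketch.
        self._entity_id = entity_id
        self._entity_id_ = entity_id_
        self._prior_prob = prior_prob

    @property
    def entity_id(self) -> int:
        return self._entity_id

    @property
    def entity_id_(self) -> str:
        return self._entity_id_

    @property
    def prior_prob(self) -> float:
        return self._prior_prob
```

In this sketch the abstract base cannot be instantiated directly, which mirrors the "Add abstract class check" item above, and code that needs prior probabilities has to ask for the concrete subclass.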
Raphael Mitsch
1f23c615d7
Refactor KB for easier customization (#11268)
* Add implementation of batching + backwards compatibility fixes. Tests indicate issue with batch disambiguation for custom singular entity lookups.
* Fix tests. Add distinction w.r.t. batch size.
* Remove redundant and add new comments.
* Adjust comments. Fix variable naming in EL prediction.
* Fix mypy errors.
* Remove KB entity type config option. Change return types of candidate retrieval functions to Iterable from Iterator. Fix various other issues.
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/kb_base.pyx
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/kb_base.pyx
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Add error messages to NotImplementedErrors. Remove redundant comment.
* Fix imports.
* Remove redundant comments.
* Rename KnowledgeBase to InMemoryLookupKB and BaseKnowledgeBase to KnowledgeBase.
* Fix tests.
* Update spacy/errors.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Move KB into subdirectory.
* Adjust imports after KB move to dedicated subdirectory.
* Fix config imports.
* Move Candidate + retrieval functions to separate module. Fix other, small issues.
* Fix docstrings and error message w.r.t. class names. Fix typing for candidate retrieval functions.
* Update spacy/kb/kb_in_memory.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix typing.
* Change typing of mentions to be Span instead of Union[Span, str].
* Update docs.
* Update EntityLinker and _architecture docs.
* Update website/docs/api/entitylinker.md
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Adjust message for E1046.
* Re-add section for Candidate in kb.md, add reference to dedicated page.
* Update docs and docstrings.
* Re-add section + reference for KnowledgeBase.get_alias_candidates() in docs.
* Update spacy/kb/candidate.pyx
* Update spacy/kb/kb_in_memory.pyx
* Update spacy/pipeline/legacy/entity_linker.py
* Remove candidate.md. Remove mistakenly added config snippet in entity_linker.py.
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-09-08 10:38:07 +02:00
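
The split this commit introduces (an abstract KnowledgeBase that custom implementations subclass, plus batched candidate retrieval) can be sketched as follows. This is a simplified illustration under stated assumptions, not the spaCy implementation: mentions are plain strings here rather than Span objects, candidates are plain entity ID strings, and `ToyLookupKB` is a hypothetical stand-in for a concrete lookup KB.

```python
from typing import Dict, Iterable, List


class KnowledgeBase:
    """Abstract base: subclasses supply single-mention candidate lookup."""

    def get_candidates(self, mention: str) -> Iterable[str]:
        # NotImplementedError carries a message naming the offending class.
        raise NotImplementedError(
            f"{type(self).__name__} must implement get_candidates()."
        )

    def get_candidates_batch(self, mentions: Iterable[str]) -> Iterable[Iterable[str]]:
        # Default batching: fall back to one lookup per mention.
        return [self.get_candidates(mention) for mention in mentions]


class ToyLookupKB(KnowledgeBase):
    """Hypothetical in-memory KB mapping mention text to entity IDs."""

    def __init__(self, aliases: Dict[str, List[str]]):
        self._aliases = aliases

    def get_candidates(self, mention: str) -> List[str]:
        return self._aliases.get(mention, [])
```

A subclass that can serve a whole batch more efficiently (for example, with one query to an external store) would override `get_candidates_batch()` instead of relying on the per-mention fallback.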