Commit Graph

24 Commits

Author SHA1 Message Date
Raphael Mitsch
0a36f9d9e1 Merge branch 'master' into feature/candidate-generation-by-docs
# Conflicts:
#	spacy/kb/kb_in_memory.pyx
#	spacy/pipeline/entity_linker.py
#	spacy/tests/doc/test_span.py
#	spacy/tests/pipeline/test_entity_linker.py
#	spacy/tokens/span.pyx
2023-04-19 09:49:11 +02:00
Sofie Van Landeghem
74cae47bf6
rely on is_empty property instead of __len__ (#12347) 2023-03-01 12:06:07 +01:00
Adriane Boyd
3b8918e166
API docs: Rename kb_in_memory to inmemorylookupkb, add to sidebar (#12128)
* API docs: Rename kb_in_memory to inmemorylookupkb, add to sidebar

* adjust to mdx

* linkout to InMemoryLookupKB at first occurrence in kb.mdx

* fix links to docs

* revert Azure trigger setting (I'll make a separate PR)

Co-authored-by: svlandeg <svlandeg@github.com>
2023-01-19 13:29:17 +01:00
Raphael Mitsch
b6bc6885d9 Switch to SpanGroup (from Doc) for bundling Spans for candidate retrieval. 2022-12-15 10:17:25 +01:00
Raphael Mitsch
51c485da09 Fix candidate retrieval interface. 2022-12-14 11:53:39 +01:00
Raphael Mitsch
53a24abd8b Modify candidate retrieval interface to accept docs instead of individual spans. 2022-12-14 11:51:37 +01:00
Raphael Mitsch
cb640abe81 Fix EL test. 2022-12-12 14:04:34 +01:00
Raphael Mitsch
2870c8f4d6 Remove kwargs from KnowledgeBase.generate_from_disk(). 2022-12-05 16:38:02 +01:00
Raphael Mitsch
ff7fc0850d Add kwargs to KnowledgeBase.generate_from_disk(). 2022-12-05 16:35:03 +01:00
Raphael Mitsch
60eda0d7a5 Update setup.py. Remove temporary comments. 2022-11-29 15:15:47 +01:00
Raphael Mitsch
3e668503de Finish Candidate refactoring. 2022-11-29 15:03:54 +01:00
Raphael Mitsch
75aee55bc3 Start refactoring of Candidate classes. 2022-11-28 17:29:35 +01:00
Raphael Mitsch
7e6888dcd4 Add empty_kb() as config argument. 2022-11-28 10:46:02 +01:00
Raphael Mitsch
b1d458eca7 Add generate_from_disk() factory method. 2022-11-25 12:02:37 +01:00
Raphael Mitsch
4eb072fa91 Add abstract method KnowledgeBase.__len__(). 2022-11-23 21:24:17 +01:00
Raphael Mitsch
1480009715 Make entity_vector_length available in Python. 2022-11-16 16:16:20 +01:00
Raphael Mitsch
aa2b5122b6 Make entity_vector_length available in Python. 2022-11-16 16:07:39 +01:00
Raphael Mitsch
d6d4c45eef Make entity_vector_length writable. 2022-11-16 15:52:34 +01:00
Raphael Mitsch
b572e2473a Update docstring. 2022-11-09 14:31:22 +01:00
Raphael Mitsch
c5b15e0e04 Update docstring. 2022-11-09 14:31:08 +01:00
Raphael Mitsch
b398cca5cc Replace leftover Generator typing with Iterator. 2022-11-04 12:46:03 +01:00
Raphael Mitsch
b32f48c878 Change typing from Generator to Iterable. 2022-10-20 09:43:47 +02:00
Raphael Mitsch
7c28424f47 Convert batched into doc-wise batched candidate generation. 2022-10-18 15:31:15 +02:00
Raphael Mitsch
1f23c615d7
Refactor KB for easier customization (#11268)
* Add implementation of batching + backwards compatibility fixes. Tests indicate issue with batch disambiguation for custom singular entity lookups.

* Fix tests. Add distinction w.r.t. batch size.

* Remove redundant and add new comments.

* Adjust comments. Fix variable naming in EL prediction.

* Fix mypy errors.

* Remove KB entity type config option. Change return types of candidate retrieval functions to Iterable from Iterator. Fix various other issues.

* Update spacy/pipeline/entity_linker.py

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>

* Update spacy/pipeline/entity_linker.py

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>

* Update spacy/kb_base.pyx

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>

* Update spacy/kb_base.pyx

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>

* Update spacy/pipeline/entity_linker.py

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>

* Add error messages to NotImplementedErrors. Remove redundant comment.

* Fix imports.

* Remove redundant comments.

* Rename KnowledgeBase to InMemoryLookupKB and BaseKnowledgeBase to KnowledgeBase.

* Fix tests.

* Update spacy/errors.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Move KB into subdirectory.

* Adjust imports after KB move to dedicated subdirectory.

* Fix config imports.

* Move Candidate + retrieval functions to separate module. Fix other, small issues.

* Fix docstrings and error message w.r.t. class names. Fix typing for candidate retrieval functions.

* Update spacy/kb/kb_in_memory.pyx

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/ml/models/entity_linker.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix typing.

* Change typing of mentions to be Span instead of Union[Span, str].

* Update docs.

* Update EntityLinker and _architecture docs.

* Update website/docs/api/entitylinker.md

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>

* Adjust message for E1046.

* Re-add section for Candidate in kb.md, add reference to dedicated page.

* Update docs and docstrings.

* Re-add section + reference for KnowledgeBase.get_alias_candidates() in docs.

* Update spacy/kb/candidate.pyx

* Update spacy/kb/kb_in_memory.pyx

* Update spacy/pipeline/legacy/entity_linker.py

* Remove candidate.md. Remove mistakenly added config snippet in entity_linker.py.

Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-09-08 10:38:07 +02:00