Raphael Mitsch
78c72d3ab7
Merge branch 'main' into feature/docwise-generator-batching
2024-01-30 21:00:22 +01:00
Daniël de Kok
81beaea70e
Merge remote-tracking branch 'upstream/master' into maintenance/v4-merge-master-20240119
2024-01-19 12:34:29 +01:00
Adriane Boyd
538304948e
Remove profile=True from currently profiled cython
2023-09-28 17:09:41 +02:00
Raphael Mitsch
5bad3d2118
Format.
2023-07-27 16:36:15 +02:00
Raphael Mitsch
a2585333a9
Fix merge errors.
2023-07-27 16:27:59 +02:00
Raphael Mitsch
8aa59c4f65
Merge branch 'v4' into feature/docwise-generator-batching
# Conflicts:
# spacy/kb/kb.pyx
# spacy/kb/kb_in_memory.pyx
# spacy/ml/models/entity_linker.py
# spacy/pipeline/entity_linker.py
# spacy/tests/pipeline/test_entity_linker.py
# website/docs/api/entitylinker.mdx
2023-07-27 14:28:06 +02:00
svlandeg
0e3b6a87d6
Merge branch 'upstream_master' into sync_v4
2023-07-19 16:37:31 +02:00
Basile Dura
b0228d8ea6
ci: add cython linter (#12694)
* chore: add cython-linter dev dependency
* fix: lexeme.pyx
* fix: morphology.pxd
* fix: tokenizer.pxd
* fix: vocab.pxd
* fix: morphology.pxd (line length)
* ci: add cython-lint
* ci: fix cython-lint call
* Fix kb/candidate.pyx.
* Fix kb/kb.pyx.
* Fix kb/kb_in_memory.pyx.
* Fix kb.
* Fix training/ partially.
* Fix training/. Ignore trailing whitespaces and too long lines.
* Fix ml/.
* Fix matcher/.
* Fix pipeline/.
* Fix tokens/.
* Fix build errors. Fix vocab.pyx.
* Fix cython-lint install and run.
* Fix lexeme.pyx, parts_of_speech.pxd, vectors.pyx. Temporarily disable cython-lint execution.
* Fix attrs.pyx, lexeme.pyx, symbols.pxd, isort issues.
* Make cython-lint install conditional. Fix tokenizer.pyx.
* Fix remaining files. Reenable cython-lint check.
* Readded parentheses.
* Fix test_build_dependencies().
* Add explanatory comment to cython-lint execution.
---------
Co-authored-by: Raphael Mitsch <r.mitsch@outlook.com>
2023-07-19 12:03:31 +02:00
Daniël de Kok
2468742cb8
isort all the things
2023-06-26 11:41:03 +02:00
Daniël de Kok
e2b70df012
Configure isort to use the Black profile, recursively isort the spacy module (#12721)
* Use isort with Black profile
* isort all the things
* Fix import cycles as a result of import sorting
* Add DOCBIN_ALL_ATTRS type definition
* Add isort to requirements
* Remove isort from build dependencies check
* Typo
2023-06-14 17:48:41 +02:00
Raphael Mitsch
10ddefa686
Update spacy/kb/kb_in_memory.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-04-24 20:44:37 +02:00
Raphael Mitsch
cb79af3a10
Fix merge leftovers.
2023-03-20 10:31:11 +01:00
Raphael Mitsch
73bdeb01e4
Merge branch 'refactor/el-candidates' into feature/docwise-generator-batching
# Conflicts:
# spacy/kb/candidate.py
# spacy/kb/kb.pyx
# spacy/kb/kb_in_memory.pyx
# spacy/ml/models/entity_linker.py
# spacy/pipeline/entity_linker.py
# spacy/tests/pipeline/test_entity_linker.py
# website/docs/api/inmemorylookupkb.mdx
# website/docs/api/kb.mdx
2023-03-20 10:24:17 +01:00
Raphael Mitsch
9340eb8ad2
Introduce hierarchy for EL Candidate objects (#12341)
* Convert Candidate from Cython to Python class.
* Format.
* Fix .entity_ typo in _add_activations() usage.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update doc string of BaseCandidate.__init__().
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename Candidate to InMemoryCandidate, BaseCandidate to Candidate.
* Adjust Candidate to support and mandate numerical entity IDs.
* Format.
* Fix docstring and docs.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename alias -> mention.
* Refactor Candidate attribute names. Update docs and tests accordingly.
* Refactor Candidate attributes and their usage.
* Format.
* Fix mypy error.
* Update error code in line with v4 convention.
* Update spacy/kb/candidate.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Updated error code.
* Simplify interface for int/str representations.
* Update website/docs/api/kb.mdx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Rename 'alias' to 'mention'.
* Port Candidate and InMemoryCandidate to Cython.
* Remove redundant entry in setup.py.
* Add abstract class check.
* Drop storing mention.
* Update spacy/kb/candidate.pxd
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix entity_id refactoring problems in docstrings.
* Drop unused InMemoryCandidate._entity_hash.
* Update docstrings.
* Move attributes out of Candidate.
* Partially fix alias/mention terminology usage. Convert Candidate to interface.
* Remove prior_prob from supported properties in Candidate. Introduce KnowledgeBase.supports_prior_probs().
* Update docstrings related to prior_prob.
* Update alias/mention usage in doc(strings).
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Mention -> alias renaming. Drop Candidate.mentions(). Drop InMemoryLookupKB.get_alias_candidates() from docs.
* Update docstrings.
* Fix InMemoryCandidate attribute names.
* Update spacy/kb/kb.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update W401 test.
* Update spacy/errors.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/kb/kb.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Use Candidate output type for toy generators in the test suite to mimic best practices
* fix docs
* fix import
---------
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-03-20 00:34:35 +01:00
Raphael Mitsch
28dbed64cb
Update alias/mention usage in doc(strings).
2023-03-14 13:33:05 +01:00
Raphael Mitsch
4a921766f1
Remove prior_prob from supported properties in Candidate. Introduce KnowledgeBase.supports_prior_probs().
2023-03-13 16:54:38 +01:00
Raphael Mitsch
6adc15178f
Partially fix alias/mention terminology usage. Convert Candidate to interface.
2023-03-13 14:26:14 +01:00
Raphael Mitsch
c61654eef8
Drop storing mention.
2023-03-09 15:04:10 +01:00
Raphael Mitsch
b476041417
Port Candidate and InMemoryCandidate to Cython.
2023-03-09 14:44:41 +01:00
Raphael Mitsch
1c937db3af
Rename 'alias' to 'mention'.
2023-03-09 12:06:15 +01:00
Raphael Mitsch
8dbb74c9c0
Merge branch 'v4' into refactor/el-candidates
2023-03-07 09:06:51 +01:00
Raphael Mitsch
f33f0ed160
Merge branch 'v4' into feature/docwise-generator-batching
# Conflicts:
# spacy/pipeline/entity_linker.py
# website/docs/api/entitylinker.mdx
2023-03-06 10:21:12 +01:00
Raphael Mitsch
bb7418ebdd
Modify EL batching system.
2023-03-06 10:05:46 +01:00
Raphael Mitsch
38dce966e5
Refactor Candidate attributes and their usage.
2023-03-05 13:49:13 +01:00
Raphael Mitsch
94e57d0ed5
Refactor Candidate attribute names. Update docs and tests accordingly.
2023-03-03 11:08:17 +01:00
Raphael Mitsch
49abf4fb3a
Rename Candidate to InMemoryCandidate, BaseCandidate to Candidate.
2023-03-01 14:27:50 +01:00
Sofie Van Landeghem
74cae47bf6
rely on is_empty property instead of __len__ (#12347)
2023-03-01 12:06:07 +01:00
Raphael Mitsch
cd98ab4e95
Convert Candidate from Cython to Python class.
2023-02-28 13:49:52 +01:00
Adriane Boyd
3b8918e166
API docs: Rename kb_in_memory to inmemorylookupkb, add to sidebar (#12128)
* API docs: Rename kb_in_memory to inmemorylookupkb, add to sidebar
* adjust to mdx
* linkout to InMemoryLookupKB at first occurrence in kb.mdx
* fix links to docs
* revert Azure trigger setting (I'll make a separate PR)
Co-authored-by: svlandeg <svlandeg@github.com>
2023-01-19 13:29:17 +01:00
Raphael Mitsch
1f23c615d7
Refactor KB for easier customization (#11268)
* Add implementation of batching + backwards compatibility fixes. Tests indicate issue with batch disambiguation for custom singular entity lookups.
* Fix tests. Add distinction w.r.t. batch size.
* Remove redundant and add new comments.
* Adjust comments. Fix variable naming in EL prediction.
* Fix mypy errors.
* Remove KB entity type config option. Change return types of candidate retrieval functions to Iterable from Iterator. Fix various other issues.
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/kb_base.pyx
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/kb_base.pyx
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Update spacy/pipeline/entity_linker.py
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Add error messages to NotImplementedErrors. Remove redundant comment.
* Fix imports.
* Remove redundant comments.
* Rename KnowledgeBase to InMemoryLookupKB and BaseKnowledgeBase to KnowledgeBase.
* Fix tests.
* Update spacy/errors.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Move KB into subdirectory.
* Adjust imports after KB move to dedicated subdirectory.
* Fix config imports.
* Move Candidate + retrieval functions to separate module. Fix other, small issues.
* Fix docstrings and error message w.r.t. class names. Fix typing for candidate retrieval functions.
* Update spacy/kb/kb_in_memory.pyx
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/ml/models/entity_linker.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix typing.
* Change typing of mentions to be Span instead of Union[Span, str].
* Update docs.
* Update EntityLinker and _architecture docs.
* Update website/docs/api/entitylinker.md
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
* Adjust message for E1046.
* Re-add section for Candidate in kb.md, add reference to dedicated page.
* Update docs and docstrings.
* Re-add section + reference for KnowledgeBase.get_alias_candidates() in docs.
* Update spacy/kb/candidate.pyx
* Update spacy/kb/kb_in_memory.pyx
* Update spacy/pipeline/legacy/entity_linker.py
* Remove candidate.md. Remove mistakenly added config snippet in entity_linker.py.
Co-authored-by: Paul O'Leary McCann <polm@dampfkraft.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-09-08 10:38:07 +02:00