spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-10-04 02:46:40 +03:00

Sofie Van Landeghem 0ba1b5eebc CLI scripts for entity linking (wikipedia & generic) (#4091)	2019-08-13 15:38:59 +02:00
* document token ent_kb_id
* document span kb_id
* update pipeline documentation
* prior and context weights as bool's instead
* entitylinker api documentation
* drop for both models
* finish entitylinker documentation
* small fixes
* documentation for KB
* candidate documentation
* links to api pages in code
* small fix
* frequency examples as counts for consistency
* consistent documentation about tensors returned by predict
* add entity linking to usage 101
* add entity linking infobox and KB section to 101
* entity-linking in linguistic features
* small typo corrections
* training example and docs for entity_linker
* predefined nlp and kb
* revert back to similarity encodings for simplicity (for now)
* set prior probabilities to 0 when excluded
* code clean up
* bugfix: deleting kb ID from tokens when entities were removed
* refactor train el example to use either model or vocab
* pretrain_kb example for example kb generation
* add to training docs for KB + EL example scripts
* small fixes
* error numbering
* ensure the language of vocab and nlp stay consistent across serialization
* equality with =
* avoid conflict in errors file
* add error 151
* final adjustements to the train scripts - consistency
* update of goldparse documentation
* small corrections
* push commit
* turn kb_creator into CLI script (wip)
* proper parameters for training entity vectors
* wikidata pipeline split up into two executable scripts
* remove context_width
* move wikidata scripts in bin directory, remove old dummy script
* refine KB script with logs and preprocessing options
* small edits
* small improvements to logging of EL CLI script
__init__.pxd	* Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx	2015-07-13 20:20:58 +02:00
__init__.py	Tidy up and improve docs and docstrings (#3370)	2019-03-08 11:42:26 +01:00
_retokenize.pyx	Update lemma and vector information after splitting a token (#4097)	2019-08-08 15:09:44 +02:00
_serialize.py	Reformat	2019-07-11 11:49:36 +02:00
doc.pxd	cleanup	2019-07-11 13:09:22 +02:00
doc.pyx	CLI scripts for entity linking (wikipedia & generic) (#4091)	2019-08-13 15:38:59 +02:00
span.pxd	annotate kb_id through ents in doc	2019-03-22 11:36:44 +01:00
span.pyx	Add span.tensor and token.tensor attributes	2019-08-01 18:30:50 +02:00
token.pxd	ensure Span.as_doc keeps the entity links + unit test	2019-06-25 15:28:51 +02:00
token.pyx	Add span.tensor and token.tensor attributes	2019-08-01 18:30:50 +02:00
underscore.py	💫 Improve introspection of custom extension attributes (#3729)	2019-05-12 00:53:11 +02:00