spaCy/bin
Sofie Van Landeghem 2d249a9502 KB extensions and better parsing of WikiData (#4375)
* fix overflow error on windows

* more documentation & logging fixes

* md fix

* 3 different limit parameters to play with execution time

* bug fixes directory locations

* small fixes

* exclude dev test articles from prior probabilities stats

* small fixes

* filtering wikidata entities, removing numeric and meta items

* adding aliases from wikidata also to the KB

* fix adding WD aliases

* adding also new aliases to previously added entities

* fixing commas

* small doc fixes

* adding subclassof filtering

* append alias functionality in KB

* prevent appending the same entity-alias pair

* fix for appending WD aliases

* remove date filter

* remove unnecessary import

* small corrections and reformatting

* remove WD aliases for now (too slow)

* removing numeric entities from training and evaluation

* small fixes

* shortcut during prediction if there is only one candidate

* add counts and fscore logging, remove FP NER from evaluation

* fix entity_linker.predict to take docs instead of single sentences

* remove enumeration sentences from the WP dataset

* entity_linker.update to process full doc instead of single sentence

* spelling corrections and dump locations in readme

* NLP IO fix

* reading KB is unnecessary at the end of the pipeline

* small logging fix

* remove empty files
2019-10-14 12:28:53 +02:00
ud                     Bugfix initializing DocBin with attributes (#4368)  2019-10-03 14:48:45 +02:00
wiki_entity_linking    KB extensions and better parsing of WikiData (#4375)  2019-10-14 12:28:53 +02:00
__init__.py            clean up code, remove old code, move to bin  2019-06-18 13:20:40 +02:00
cythonize.py           💫 Tidy up and auto-format .py files (#2983)  2018-11-30 17:03:03 +01:00
get-version.sh         Add get-version script  2019-08-25 15:12:36 +02:00
load_reddit.py         Replacing regex library with re to increase tokenization speed (#3218)  2019-02-01 18:05:22 +11:00
push-tag.sh            Fix push-tag script  2019-05-11 19:04:35 +02:00
spacy                  Add entry point-style auto alias for "spacy"  2017-08-14 12:18:39 +02:00
train_word_vectors.py  counter instead of preshcounter  2019-07-11 13:05:53 +02:00