* rename Pipe to TrainablePipe
* split functionality between Pipe and TrainablePipe
* remove unnecessary methods from certain components
* cleanup
* hasattr(component, "pipe") should be sufficient again
* remove serialization and vocab/cfg from Pipe
* unify _ensure_examples and validate_examples
* small fixes
* hasattr checks for self.cfg and self.vocab
* make is_resizable and is_trainable properties
* serialize strings.json instead of vocab
* fix KB IO + tests
* fix typos
* more typos
* _added_strings as a set
* few more tests specifically for _added_strings field
* bump to 3.0.0a36
* NEL: read sentences and ents from reference
* fiddling with sent_start annotations
* add KB serialization test
* KB write additional file with strings.json
* score_links function to calculate NEL P/R/F
* formatting
* documentation
* candidate generator as separate part of EL config
* update comment
* ent instead of str as input for candidate generation
* Span instead of str: correct type indication
* fix types
* unit test to create new candidate generator
* fix replace_pipe argument passing
* move error message, general cleanup
* add vocab back to KB constructor
* provide KB as callable from Vocab arg
* rename to kb_loader, fix KB serialization as part of the EL pipe
* fix typo
* reformatting
* cleanup
* fix comment
* fix wrongly duplicated code from merge conflict
* rename dump to to_disk
* from_disk instead of load_bulk
* update test after recent removal of set_morphology in tagger
* remove old doc
* fix overflow error on windows
* more documentation & logging fixes
* md fix
* 3 different limit parameters to play with execution time
* bug fixes directory locations
* small fixes
* exclude dev test articles from prior probabilities stats
* small fixes
* filtering wikidata entities, removing numeric and meta items
* adding aliases from wikidata also to the KB
* fix adding WD aliases
* adding also new aliases to previously added entities
* fixing comma's
* small doc fixes
* adding subclassof filtering
* append alias functionality in KB
* prevent appending the same entity-alias pair
* fix for appending WD aliases
* remove date filter
* remove unnecessary import
* small corrections and reformatting
* remove WD aliases for now (too slow)
* removing numeric entities from training and evaluation
* small fixes
* shortcut during prediction if there is only one candidate
* add counts and fscore logging, remove FP NER from evaluation
* fix entity_linker.predict to take docs instead of single sentences
* remove enumeration sentences from the WP dataset
* entity_linker.update to process full doc instead of single sentence
* spelling corrections and dump locations in readme
* NLP IO fix
* reading KB is unnecessary at the end of the pipeline
* small logging fix
* remove empty files
* document token ent_kb_id
* document span kb_id
* update pipeline documentation
* prior and context weights as bool's instead
* entitylinker api documentation
* drop for both models
* finish entitylinker documentation
* small fixes
* documentation for KB
* candidate documentation
* links to api pages in code
* small fix
* frequency examples as counts for consistency
* consistent documentation about tensors returned by predict
* add entity linking to usage 101
* add entity linking infobox and KB section to 101
* entity-linking in linguistic features
* small typo corrections
* training example and docs for entity_linker
* predefined nlp and kb
* revert back to similarity encodings for simplicity (for now)
* set prior probabilities to 0 when excluded
* code clean up
* bugfix: deleting kb ID from tokens when entities were removed
* refactor train el example to use either model or vocab
* pretrain_kb example for example kb generation
* add to training docs for KB + EL example scripts
* small fixes
* error numbering
* ensure the language of vocab and nlp stay consistent across serialization
* equality with =
* avoid conflict in errors file
* add error 151
* final adjustements to the train scripts - consistency
* update of goldparse documentation
* small corrections
* push commit
* typo fix
* add candidate API to kb documentation
* update API sidebar with EntityLinker and KnowledgeBase
* remove EL from 101 docs
* remove entity linker from 101 pipelines / rephrase
* custom el model instead of existing model
* set version to 2.2 for EL functionality
* update documentation for 2 CLI scripts
* document token ent_kb_id
* document span kb_id
* update pipeline documentation
* prior and context weights as bool's instead
* entitylinker api documentation
* drop for both models
* finish entitylinker documentation
* small fixes
* documentation for KB
* candidate documentation
* links to api pages in code
* small fix
* frequency examples as counts for consistency
* consistent documentation about tensors returned by predict
* add entity linking to usage 101
* add entity linking infobox and KB section to 101
* entity-linking in linguistic features
* small typo corrections
* training example and docs for entity_linker
* predefined nlp and kb
* revert back to similarity encodings for simplicity (for now)
* set prior probabilities to 0 when excluded
* code clean up
* bugfix: deleting kb ID from tokens when entities were removed
* refactor train el example to use either model or vocab
* pretrain_kb example for example kb generation
* add to training docs for KB + EL example scripts
* small fixes
* error numbering
* ensure the language of vocab and nlp stay consistent across serialization
* equality with =
* avoid conflict in errors file
* add error 151
* final adjustements to the train scripts - consistency
* update of goldparse documentation
* small corrections
* push commit
* turn kb_creator into CLI script (wip)
* proper parameters for training entity vectors
* wikidata pipeline split up into two executable scripts
* remove context_width
* move wikidata scripts in bin directory, remove old dummy script
* refine KB script with logs and preprocessing options
* small edits
* small improvements to logging of EL CLI script