spaCy/bin
Sofie Van Landeghem e48a09df4e Example class for training data (#4543)
* OrigAnnot class instead of gold.orig_annot list of zipped tuples

* from_orig to replace from_annot_tuples

* rename to RawAnnot

* some unit tests for GoldParse creation and internal format

* removing orig_annot and switching to lists instead of tuples

* rewriting tuples to use RawAnnot (+ debug statements, WIP)

* fix pop() changing the data

* small fixes

* pop-append fixes

* return RawAnnot for existing GoldParse to have uniform interface

* clean up imports

* fix merge_sents

* add unit test for 4402 with new structure (not working yet)

* introduce DocAnnot

* typo fixes

* add unit test for merge_sents

* rename from_orig to from_raw

* fixing unit tests

* fix nn parser

* read_annots to produce text, doc_annot pairs

* _make_golds fix

* rename golds_to_gold_annots

* small fixes

* fix encoding

* have golds_to_gold_annots use DocAnnot

* missed a spot

* merge_sents as function in DocAnnot

* allow specifying only part of the token-level annotations

* refactor with Example class + underlying dicts

* pipeline components to work with Example objects (wip)

* input checking

* fix yielding

* fix calls to update

* small fixes

* fix scorer unit test with new format

* fix kwargs order

* fixes for ud and conllu scripts

* fix reading data for conllu script

* add in proper errors (not fixed numbering yet to avoid merge conflicts)

* fixing a few more small bugs

* fix EL script
2019-11-11 17:35:27 +01:00
Name | Last commit | Date
ud | Example class for training data (#4543) | 2019-11-11 17:35:27 +01:00
wiki_entity_linking | Example class for training data (#4543) | 2019-11-11 17:35:27 +01:00
__init__.py | clean up code, remove old code, move to bin | 2019-06-18 13:20:40 +02:00
cythonize.py | 💫 Tidy up and auto-format .py files (#2983) | 2018-11-30 17:03:03 +01:00
get-version.sh | Add get-version script | 2019-08-25 15:12:36 +02:00
load_reddit.py | Replacing regex library with re to increase tokenization speed (#3218) | 2019-02-01 18:05:22 +11:00
push-tag.sh | Fix push-tag script | 2019-05-11 19:04:35 +02:00
spacy | Add entry point-style auto alias for "spacy" | 2017-08-14 12:18:39 +02:00
train_word_vectors.py | counter instead of preshcounter | 2019-07-11 13:05:53 +02:00