* Fix NER check in CoNLL-U converter
Leave ents unset if no NER annotation is found in the MISC column.
* Revert to global rather than per-sentence NER check
* Update spacy/training/converters/conllu_to_docs.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* change '_' to '' to allow Token.pos, when no value for token pos in conllu data
* Minor code style
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Switch converters to generator functions
To reduce the memory usage when converting large corpora, refactor the
convert methods to be generator functions.
* Update tests
This doesn't make a difference given how the `merged_morph` values
override the `morph` values for all the final docs, but could have led
to unexpected bugs in the future if the converter is modified.