Mirror of https://github.com/explosion/spaCy.git (synced 2024-12-24 17:06:29 +03:00)
Temporary work-around for scoring a subset of components (#6090)
* Try hacking the scorer to work around sentence boundaries
* Upd scorer
* Set dev version
* Upd scorer hack
* Fix version
* Improve comment on hack
parent d32ce121be
commit bbdb5f62b7
@@ -270,6 +270,18 @@ class Scorer:
         for example in examples:
             pred_doc = example.predicted
             gold_doc = example.reference
+            # TODO
+            # This is a temporary hack to work around the problem that the scorer
+            # fails if you have examples that are not fully annotated for all
+            # the tasks in your pipeline. For instance, you might have a corpus
+            # of NER annotations that does not set sentence boundaries, but the
+            # pipeline includes a parser or senter, and then the score_weights
+            # are used to evaluate that component. When the scorer attempts
+            # to read the sentences from the gold document, it fails.
+            try:
+                list(getter(gold_doc, attr))
+            except ValueError:
+                continue
             # Find all labels in gold and doc
             labels = set(
                 [k.label_ for k in getter(gold_doc, attr)]
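For context, here is a minimal sketch (not part of the commit) of the failure mode this guard works around, assuming spaCy v3's Doc/Span API and a blank pipeline: a gold doc that carries only entity annotation has no sentence boundaries, so reading its sents raises ValueError, which is exactly what the new try/except catches and turns into a skip.

# Minimal sketch (illustration only, not from the commit), assuming spaCy v3.
import spacy
from spacy.tokens import Doc, Span

nlp = spacy.blank("en")
words = ["Apple", "is", "looking", "at", "buying", "a", "startup"]

# A gold doc with NER-only annotation; sentence boundaries stay unset.
gold_doc = Doc(nlp.vocab, words=words)
gold_doc.ents = [Span(gold_doc, 0, 1, label="ORG")]

try:
    # The scorer effectively runs list(getter(gold_doc, attr)) with
    # attr="sents" when a parser or senter is scored; Doc.sents raises
    # ValueError because no sentence boundaries were annotated.
    list(gold_doc.sents)
except ValueError:
    # With this commit, the scorer skips such examples for that score
    # instead of crashing.
    print("gold doc has no sentence boundaries; example skipped")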