Tweak mention limit calculation

The calculation of this limit in the coref-hoi code is hard to follow.
Based on comments and variable names, it appears to use the doc length,
but it might actually use the number of mentions. The number of mentions
should be much larger and seems more correct, but this may be worth
revisiting.
Paul O'Leary McCann 2021-07-03 21:13:32 +09:00
parent 2d3c559dc4
commit 5db28ec2fd

@@ -267,7 +267,9 @@ def coarse_prune(
 # calculate the doc length
 doclen = ends[-1] - starts[0]
-mlimit = min(mention_limit, int(mention_limit_ratio * doclen))
+# XXX seems to make more sense to use menlen than doclen here?
+#mlimit = min(mention_limit, int(mention_limit_ratio * doclen))
+mlimit = min(mention_limit, int(mention_limit_ratio * menlen))
 # csel is a 1d integer list
 csel = select_non_crossing_spans(tops, starts, ends, mlimit)
 # add the offset so these indices are absolute
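
To make the effect of the change concrete, here is a minimal standalone sketch of the limit formula from the hunk above. The helper name `compute_mlimit` and all of the numbers are hypothetical and chosen only for illustration; they are not taken from the repository or from coref-hoi.

```python
def compute_mlimit(mention_limit: int, mention_limit_ratio: float, count: int) -> int:
    """Return the smaller of the absolute cap and ratio * count.

    `count` is whichever quantity the ratio is applied to: the doc length
    (old behaviour) or the number of candidate mentions (new behaviour).
    """
    return min(mention_limit, int(mention_limit_ratio * count))


# Hypothetical values, for illustration only.
mention_limit = 200        # absolute cap on kept mentions
mention_limit_ratio = 0.5  # fraction of `count` to keep
doclen = 100               # assumed doc length
menlen = 450               # assumed number of candidate mentions

# Old behaviour: the ratio is applied to the doc length.
by_doclen = compute_mlimit(mention_limit, mention_limit_ratio, doclen)

# New behaviour: the ratio is applied to the number of mentions.
by_menlen = compute_mlimit(mention_limit, mention_limit_ratio, menlen)

print(by_doclen, by_menlen)
```

With these assumed numbers the doc-length version keeps far fewer mentions, while the mention-count version is large enough that the absolute cap becomes the binding constraint.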