Mirror of https://github.com/explosion/spaCy.git
synced 2025-03-14 15:12:15 +03:00
If a sentence splitter is used as one of the annotating components during training, an entity can be split across a sentence boundary in the predicted `Doc`. Before this change, training would fail because no zero-cost transition sequence could be found. This fixes two scenarios: 1. When the gold action is `B` and a split occurs after the current token, the `BEGIN` action is invalid, yet it was the only possible zero-cost action. This change makes `OUT` a zero-cost action in this case. 2. When the gold action is `I` and a split occurs after the current token, the `IN` action is invalid, which removes the only zero-cost action. This change makes `LAST` a zero-cost action, so that the entity can be properly closed. |
||
---|---|---|
.. | ||
__init__.pxd | ||
__init__.py | ||
_beam_utils.pxd | ||
_beam_utils.pyx | ||
_state.pxd | ||
_state.pyx | ||
arc_eager.pxd | ||
arc_eager.pyx | ||
ner.pxd | ||
ner.pyx | ||
nonproj.hh | ||
nonproj.pxd | ||
nonproj.pyx | ||
stateclass.pxd | ||
stateclass.pyx | ||
transition_system.pxd | ||
transition_system.pyx |
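
The two scenarios in the commit message above can be sketched in plain Python. This is a simplified illustration and not the code in `ner.pyx`: the `Action` enum, the `zero_cost_action` function, and the `split_after_current` flag are hypothetical names introduced here, and the sketch ignores everything else the real oracle tracks (entity labels, the currently open entity, other validity constraints).

```python
from enum import Enum


class Action(Enum):
    # Hypothetical stand-ins for the BILUO transition names.
    BEGIN = "B"
    IN = "I"
    LAST = "L"
    UNIT = "U"
    OUT = "O"


def zero_cost_action(gold: Action, split_after_current: bool) -> Action:
    """Return the action treated as zero-cost for the current token.

    Assumption: a sentence split directly after the current token makes
    BEGIN and IN invalid, because the entity could not continue into the
    next sentence.
    """
    if not split_after_current:
        # No predicted sentence boundary: the gold action is usable as-is.
        return gold
    if gold is Action.BEGIN:
        # Scenario 1: BEGIN is invalid, so OUT becomes the zero-cost action.
        return Action.OUT
    if gold is Action.IN:
        # Scenario 2: IN is invalid, so LAST closes the entity at zero cost.
        return Action.LAST
    # Other gold actions are not covered by the described fix.
    return gold


if __name__ == "__main__":
    for gold in (Action.BEGIN, Action.IN):
        print(gold.name, "->", zero_cost_action(gold, True).name)
```

The point of the fallback is that the oracle always has at least one zero-cost action available, which is the condition the commit message says was previously violated.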