Mirror of https://github.com/explosion/spaCy.git, synced 2025-03-13 16:05:50 +03:00
If we use a sentence splitter as one of the annotating components during training, an entity can become split in the predicted `Doc`. Before this change, training would fail, because no zero-cost transition sequence could be found.

This fixes two scenarios:

1. When the gold action is `B` and a split occurs after the current token, the `BEGIN` action is invalid. However, this was the only possible zero-cost action. This change makes `OUT` a zero-cost action in this case.
2. When the gold action is `I` and a split occurs after the current token, the `IN` action is invalid, removing the only zero-cost action. This change makes `LAST` a zero-cost action, so that the entity can be properly closed.

__init__.py
test_add_label.py
test_arc_eager_oracle.py
test_ner.py
test_neural_parser.py
test_nn_beam.py
test_nonproj.py
test_parse_navigate.py
test_parse.py
test_preset_sbd.py
test_space_attachment.py
test_state.py
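
The two scenarios in the commit message can be illustrated with a toy oracle. This is a minimal sketch and not spaCy's actual code: the function name `zero_cost_actions` and its parameters are hypothetical, and the `L` to `LAST` and `U` to `UNIT` mappings are assumed from the usual BILUO scheme rather than taken from the commit message. It only shows which action is treated as zero-cost when a predicted sentence split falls right after the current token.

```python
from typing import Set

# Default mapping from a gold BILUO tag to its zero-cost transition.
# The L -> LAST and U -> UNIT entries are assumptions based on the
# usual BILUO scheme; the commit message only names BEGIN, IN, LAST and OUT.
DEFAULT_ACTION = {"B": "BEGIN", "I": "IN", "L": "LAST", "U": "UNIT", "O": "OUT"}


def zero_cost_actions(gold_action: str, split_after_token: bool) -> Set[str]:
    """Return the zero-cost transitions for one token (hypothetical helper).

    gold_action: the gold BILUO tag for the current token.
    split_after_token: True if the predicted Doc has a sentence boundary
        directly after the current token.
    """
    if gold_action not in DEFAULT_ACTION:
        raise ValueError(f"Unknown gold action: {gold_action!r}")
    if split_after_token:
        if gold_action == "B":
            # Scenario 1: BEGIN is invalid because the entity would cross
            # the sentence boundary, so OUT becomes the zero-cost action.
            return {"OUT"}
        if gold_action == "I":
            # Scenario 2: IN is invalid for the same reason, so LAST becomes
            # zero-cost and the open entity is closed at the boundary.
            return {"LAST"}
    # No split, or a tag the split does not affect: follow the gold tag.
    return {DEFAULT_ACTION[gold_action]}
```

With this sketch, every token has at least one zero-cost action even when a split interrupts an entity, which is the property the commit message says was missing before the change.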