spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-07-15 18:52:29 +03:00

Daniël de Kok 6906af3d8f NER: Ensure zero-cost sequence with sentence split in entity		2023-03-24 15:35:22 +01:00

If we use a sentence splitter as one of the annotating components during training, an entity can become split in the predicted `Doc`. Before this change, training would fail, because no zero-cost transition sequence could be found.

This fixes two scenarios:

1. When the gold action is `B` and a split occurs after the current token, the `BEGIN` action is invalid. However, this was the only possible zero-cost action. This change makes `OUT` a zero-cost action in this case.
2. When the gold action is `I` and a split occurs after the current token, the `IN` action is invalid, removing the only zero-cost action. This change makes `LAST` a zero-cost action, so that the entity can be properly closed.
..
__init__.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_add_label.py	Support negative examples in partial NER annotations (#8106)	2021-06-17 17:33:00 +10:00
test_arc_eager_oracle.py	Migrate regression tests into the main test suite (#9655)	2021-12-04 20:34:48 +01:00
test_ner.py	NER: Ensure zero-cost sequence with sentence split in entity	2023-03-24 15:35:22 +01:00
test_neural_parser.py	Update config resolution to use new Thinc	2020-09-27 22:21:31 +02:00
test_nn_beam.py	Tidy up and auto-format	2021-01-05 13:41:53 +11:00
test_nonproj.py	Auto-format code with black (#10945)	2022-06-10 13:21:33 +02:00
test_parse_navigate.py	Raise error if deps not provided with heads (#8335)	2021-06-15 13:23:32 +02:00
test_parse.py	Update to use absolute imports in tests (#12372)	2023-03-06 17:30:17 +01:00
test_preset_sbd.py	Support negative examples in partial NER annotations (#8106)	2021-06-17 17:33:00 +10:00
test_space_attachment.py	Tidy up tests and docs	2020-09-21 20:43:54 +02:00
test_state.py	Tidy up and auto-format	2021-01-05 13:41:53 +11:00
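
The two scenarios described in the latest commit message (6906af3d8f) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not spaCy's actual implementation: the function names `valid_actions` and `zero_cost_actions`, the `in_entity` and `split_after` flags, and the simplified validity rules are all hypothetical and chosen for illustration. It shows how a fallback action keeps the set of zero-cost actions non-empty when a sentence split makes the gold action invalid.

```python
from typing import Set

# Hypothetical mapping from gold BILUO tags to transition names.
GOLD_TO_ACTION = {"B": "BEGIN", "I": "IN", "L": "LAST", "U": "UNIT", "O": "OUT"}


def valid_actions(in_entity: bool, split_after: bool) -> Set[str]:
    """Return the actions this toy transition system allows for one token.

    Assumption: an entity may not cross a sentence boundary, so BEGIN and IN
    are invalid when a split occurs after the current token.
    """
    if in_entity:
        # An open entity can be continued (IN) or closed (LAST).
        return {"LAST"} if split_after else {"IN", "LAST"}
    valid = {"UNIT", "OUT"}
    if not split_after:
        valid.add("BEGIN")
    return valid


def zero_cost_actions(gold: str, in_entity: bool, split_after: bool) -> Set[str]:
    """Return the valid actions that incur no cost against the gold tag."""
    valid = valid_actions(in_entity, split_after)
    zero = {GOLD_TO_ACTION[gold]} & valid
    if not zero and split_after:
        # Fallbacks mirroring the two scenarios in the commit message.
        if gold == "B":
            zero = {"OUT"} & valid   # scenario 1: skip the entity
        elif gold == "I":
            zero = {"LAST"} & valid  # scenario 2: close the entity early
    return zero


if __name__ == "__main__":
    # Scenario 1: gold B, split after the current token.
    print(zero_cost_actions("B", in_entity=False, split_after=True))
    # Scenario 2: gold I inside an open entity, split after the current token.
    print(zero_cost_actions("I", in_entity=True, split_after=True))
```

Without the fallback branch, both calls in the usage section would return an empty set, which corresponds to the training failure the commit message describes.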