spaCy/spacy/tests/training
Adriane Boyd 1c4df8fd09
Replace pytokenizations with internal alignment (#6293)
* Replace pytokenizations with internal alignment

Replace pytokenizations with internal alignment algorithm that is
restricted to only allow differences in whitespace and capitalization.

* Rename `spacy.training.align` to `spacy.training.alignment` to contain
the `Alignment` dataclass
* Implement `get_alignments` in `spacy.training.align`

* Refactor trailing whitespace handling

* Remove unnecessary exception for empty docs

Allow a non-empty whitespace-only doc to be aligned with an empty doc

* Remove empty docs exceptions completely
2020-11-03 16:24:38 +01:00
__init__.py move tests to correct subdir 2020-09-15 21:40:38 +02:00
test_augmenters.py Update data augmenters (#6196) 2020-10-04 17:46:29 +02:00
test_new_example.py Refactor Token morph setting (#6175) 2020-10-01 22:21:46 +02:00
test_readers.py TextCat updates and fixes (#6263) 2020-10-18 14:50:41 +02:00
test_training.py Replace pytokenizations with internal alignment (#6293) 2020-11-03 16:24:38 +01:00
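
The commit message above describes an alignment that only tolerates differences in whitespace and capitalization between two tokenizations. The following is a minimal sketch of that idea, not spaCy's actual `get_alignments` implementation: the function name `sketch_alignments`, the helper `_char_owners`, and the return shape (two lists of index lists) are assumptions chosen for illustration.

```python
from typing import List, Tuple


def _char_owners(tokens: List[str]) -> Tuple[str, List[int]]:
    """Return the normalized text and, per character, the owning token index.

    Hypothetical helper: normalization drops all whitespace and lowercases.
    """
    chars: List[str] = []
    owners: List[int] = []
    for i, token in enumerate(tokens):
        norm = "".join(token.split()).lower()
        chars.extend(norm)
        owners.extend([i] * len(norm))
    return "".join(chars), owners


def sketch_alignments(
    tokens_a: List[str], tokens_b: List[str]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Align two tokenizations that differ only in whitespace and case.

    Illustrative only; this is not the library's algorithm.
    """
    text_a, owners_a = _char_owners(tokens_a)
    text_b, owners_b = _char_owners(tokens_b)
    if text_a != text_b:
        # Any difference beyond whitespace or capitalization is rejected.
        raise ValueError("tokenizations differ in more than whitespace/case")

    a2b: List[List[int]] = [[] for _ in tokens_a]
    b2a: List[List[int]] = [[] for _ in tokens_b]
    # Walk both character streams in lockstep. Token indices never decrease,
    # so comparing with the last stored index is enough to avoid duplicates.
    for i, j in zip(owners_a, owners_b):
        if not a2b[i] or a2b[i][-1] != j:
            a2b[i].append(j)
        if not b2a[j] or b2a[j][-1] != i:
            b2a[j].append(i)
    return a2b, b2a


if __name__ == "__main__":
    a2b, b2a = sketch_alignments(["New", "York", "City"], ["new", "yorkcity"])
    print(a2b)
    print(b2a)
```

In this sketch a whitespace-only token normalizes to an empty string, so it aligns to nothing, and an empty token list aligns with a whitespace-only one without raising an error, which mirrors the behavior the commit message describes for empty docs.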