Commit Graph

3 Commits

Author SHA1 Message Date
Sofie Van Landeghem
492d1ec5de
Prevent alignment when texts don't match (#5867)
* remove empty gold.pyx

* add alignment unit test (to be used in docs)

* ensure that Alignment is only used on equal texts

* additional test using example.alignment

* formatting

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-08-04 16:29:18 +02:00
Ines Montani
e92df281ce Tidy up, autoformat, add types 2020-07-25 15:01:15 +02:00
Matthew Honnibal
cc477be952
Improve gold-standard alignment (#5711)
* Remove previous alignment

* Implement better alignment, using ragged data structure

* Use pytokenizations for alignment

* Fixes

* Fixes

* Fix overlapping entities in alignment

* Fix align split_sents

* Update test

* Commit align.py

* Try to appease setuptools

* Fix flake8

* use realistic entities for testing

* Update tests for better alignment

* Improve alignment heuristic

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2020-07-06 17:39:31 +02:00