Commit Graph

3 Commits

Author SHA1 Message Date
Jacobo Myerston
4f8daa4f00
Add Left and Right Pointing Angle Brackets as punctuation to ancient Greek (#12829)
* Update universe.json

* Update universe.json

add some missing commas in the greCy's description.

* Update punctuation.py

Add mathematical left and right angle brackets as punctuation for ancient Greek for better tokenization.
2023-07-20 11:16:01 +02:00
Daniël de Kok
e2b70df012
Configure isort to use the Black profile, recursively isort the spacy module (#12721)
* Use isort with Black profile

* isort all the things

* Fix import cycles as a result of import sorting

* Add DOCBIN_ALL_ATTRS type definition

* Add isort to requirements

* Remove isort from build dependencies check

* Typo
2023-06-14 17:48:41 +02:00
Jacobo Myerston
3e8bc1272f
add punctuation to grc (#11426)
* add punctuation to grc

Add support for special editorial punctuation that is common in ancient Greek texts.  Ancient Greek texts, as found in digital and print form, have been largely edited by scholars. Restorations and improvements are normally marked with special characters that need to be handled properly by the tokenizer.

* add unit tests

* simplify regex

* move generic quotes to char classes

* rename unit test

* fix regex

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: svlandeg <svlandeg@github.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-09-27 11:38:56 +02:00