mirror of https://github.com/explosion/spaCy.git synced 2025-02-20 13:30:34 +03:00

3 Commits

Author SHA1 Message Date
adrianeboyd
f4ef64a526 Improve tokenization for UD Dutch corpora ()
* Improve tokenization for UD Dutch corpora

Improve tokenization for UD Dutch Alpino and LassySmall.

* Format Dutch tokenizer exceptions
2020-04-06 13:18:07 +02:00
Ines Montani
f580302673 Tidy up and auto-format 2019-08-20 17:36:34 +02:00
Yves Peirsman
951825532c Improved Dutch language resources and Dutch lemmatization ()
* Improved Dutch language resources and Dutch lemmatization

* Fix conftest

* Update punctuation.py

* Auto-format

* Format and fix tests

* Remove unused test file

* Re-add deleted test

* removed redundant infix regex pattern for ','; note: brackets + simple hyphen remains

* Cleaner lemmatization files
2019-04-03 14:13:26 +02:00