spaCy/spacy/lang/nl
adrianeboyd f4ef64a526
Improve tokenization for UD Dutch corpora (#5259)
* Improve tokenization for UD Dutch corpora

Improve tokenization for UD Dutch Alpino and LassySmall.

* Format Dutch tokenizer exceptions
2020-04-06 13:18:07 +02:00
__init__.py              Improve tokenization for UD Dutch corpora (#5259)       2020-04-06 13:18:07 +02:00
examples.py              Tidy up and auto-format                                 2019-08-20 17:36:34 +02:00
lemmatizer.py            Refactor lemmatizer and data table integration (#4353)  2019-10-01 21:36:03 +02:00
lex_attrs.py             Tidy up and auto-format                                 2019-08-20 17:36:34 +02:00
punctuation.py           Improve tokenization for UD Dutch corpora (#5259)       2020-04-06 13:18:07 +02:00
stop_words.py            Tidy up and auto-format                                 2019-08-20 17:36:34 +02:00
tag_map.py               Tidy up and auto-format                                 2019-08-20 17:36:34 +02:00
tokenizer_exceptions.py  Improve tokenization for UD Dutch corpora (#5259)       2020-04-06 13:18:07 +02:00