spaCy/spacy/tests/lang/ca
Adriane Boyd b98d216205
Update Catalan language data (#8308)
* Update Catalan language data

Update Catalan language data based on contributions from the Text Mining
Unit at the Barcelona Supercomputing Center:

https://github.com/TeMU-BSC/spacy4release/tree/main/lang_data

* Update tokenizer settings for UD Catalan AnCora

Update for UD Catalan AnCora v2.7 with merged multi-word tokens.

* Update test

* Move prefix pattern to more generic infix pattern

* Clean up
2021-06-11 10:21:22 +02:00
__init__.py Revert #4334 2019-09-29 17:32:12 +02:00
test_exception.py Remove POS, TAG and LEMMA from tokenizer exceptions 2020-07-22 23:09:01 +02:00
test_prefix_suffix_infix.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
test_text.py Update Catalan language data (#8308) 2021-06-11 10:21:22 +02:00