* Update Makefile
For more recent python version
* updated for bsc changes
New tokenization changes
* Update test_text.py
* updating tests and requirements
* changed failed test in test/lang/ca
changed failed test in test/lang/ca
* Update .gitignore
deleted stashed changes line
* back to python 3.6 and remove transformer requirements
As per request
* Update test_exception.py
Change the test
* Update test_exception.py
Remove test print
* Update Makefile
For more recent python version
* updated for bsc changes
New tokenization changes
* updating tests and requirements
* Update requirements.txt
Removed spacy-transfromers from requirements
* Update test_exception.py
Added final punctuation to ensure consistency
* Update Makefile
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Format
* Update test to check all tokens
Co-authored-by: cayorodriguez <crodriguezp@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
## Description
1. Added the same infix rule as in French (`d'une`, `j'ai`) for Italian (`c'è`, `l'ha`), bringing F-score on `it_isdt-ud-train.txt` from 96% to 99%. Added unit test to check this behaviour.
2. Added specific Urdu punctuation character as suffix, improving F-score on `ur_udtb-ud-train.txt` from 94% to 100%. Added unit test to check this behaviour.
### Types of change
Enhancement of Italian & Urdu tokenization
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.