mirror of https://github.com/explosion/spaCy.git synced 2025-02-18 04:20:33 +03:00
spaCy/spacy/lang/it
Sofie 9a478b6db8 Clean up of char classes, few tokenizer fixes and faster default French tokenizer ()
* splitting up latin unicode interval

* removing hyphen as infix for French

* adding failing test for issue 1235

* test for issue  which now works

* partial fix for issue 

* keep the hyphen as infix for French (as it was)

* restore french expressions with hyphen as infix (as it was)

* added succeeding unit test for Issue 

* Fix issue  with custom Italian exception

* Fix issue  by allowing numbers right before infix /

* remove duplicate

* remove xfail for Issue  fixed by Matt

* adjust documentation and remove reference to regex lib
2019-02-20 22:10:13 +01:00
__init__.py Clean up of char classes, few tokenizer fixes and faster default French tokenizer () 2019-02-20 22:10:13 +01:00
examples.py 💫 Tidy up and auto-format .py files () 2018-11-30 17:03:03 +01:00
lemmatizer.py Fix syntax error in italian lemmatizer 2018-04-03 23:13:22 +02:00
punctuation.py Improve Italian & Urdu tokenization accuracy () 2019-02-04 22:39:25 +01:00
stop_words.py 💫 Tidy up and auto-format .py files () 2018-11-30 17:03:03 +01:00
tag_map.py 💫 Tidy up and auto-format .py files () 2018-11-30 17:03:03 +01:00
tokenizer_exceptions.py Clean up of char classes, few tokenizer fixes and faster default French tokenizer () 2019-02-20 22:10:13 +01:00