spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-03-18 09:02:29 +03:00

Sofie 9a478b6db8 Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
* splitting up latin unicode interval
* removing hyphen as infix for French
* adding failing test for issue 1235
* test for issue #3002 which now works
* partial fix for issue #2070
* keep the hyphen as infix for French (as it was)
* restore french expressions with hyphen as infix (as it was)
* added succeeding unit test for Issue #2656
* Fix issue #2822 with custom Italian exception
* Fix issue #2926 by allowing numbers right before infix /
* remove duplicate
* remove xfail for Issue #2179 fixed by Matt
* adjust documentation and remove reference to regex lib
lemmatizer	Merge branch 'master' into develop	2019-02-07 20:54:07 +01:00
__init__.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
_tokenizer_exceptions_list.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
examples.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
lex_attrs.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
punctuation.py	Replacing regex library with re to increase tokenization speed (#3218)	2019-02-01 18:05:22 +11:00
stop_words.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
syntax_iterators.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
tag_map.py	💫 Tidy up and auto-format .py files (#2983)	2018-11-30 17:03:03 +01:00
tokenizer_exceptions.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00