mirror of https://github.com/explosion/spaCy.git
synced 2024-12-25 17:36:30 +03:00
81564cc4e8
* adding e-KTP in tokenizer exceptions list
* add exception token
* removing lines containing space as it won't matter since we use .split() method in the end, added new tokens in exception
* add tokenizer exceptions list
* combining base_norms with norm_exceptions
* adding norm_exception
* fix double key in lemmatizer
* remove unused import in punctuation.py
* reformat stop_words to reduce number of lines, improve readability
* updating tokenizer exception
* implement is_currency for lang/id
* adding orth_first_upper in tokenizer_exceptions
* update the norm_exception list
* remove a bunch of abbreviations
* adding contributors file

| File |
|---|
| `__init__.py` |
| `_tokenizer_exceptions_list.py` |
| `examples.py` |
| `lemmatizer.py` |
| `lex_attrs.py` |
| `norm_exceptions.py` |
| `punctuation.py` |
| `stop_words.py` |
| `syntax_iterators.py` |
| `tokenizer_exceptions.py` |