Mirror of https://github.com/explosion/spaCy.git, synced 2025-10-30 15:37:29 +03:00
* add e-KTP to the tokenizer exceptions list
* add exception token
* remove lines containing spaces, which don't matter because the .split() method is applied at the end; add new tokens to the exceptions
* add tokenizer exceptions list
* combine base_norms with norm_exceptions
* add norm_exception
* fix duplicate key in lemmatizer
* remove unused import in punctuation.py
* reformat stop_words to reduce the number of lines and improve readability
* update tokenizer exceptions
* implement is_currency for lang/id
* add orth_first_upper in tokenizer_exceptions
* update the norm_exception list
* remove a number of abbreviations
* add contributors file
| Name |
|---|
| `__init__.py` |
| `_tokenizer_exceptions_list.py` |
| `examples.py` |
| `lemmatizer.py` |
| `lex_attrs.py` |
| `norm_exceptions.py` |
| `punctuation.py` |
| `stop_words.py` |
| `syntax_iterators.py` |
| `tokenizer_exceptions.py` |
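
The commit notes mention that whitespace in the exceptions list does not matter because `.split()` is applied at the end, and that an `orth_first_upper` variant is added in `tokenizer_exceptions`. The following is a minimal sketch of one reading of those notes, not the code in this directory: the names `EXCEPTIONS_TEXT` and `build_exceptions` and the sample entries other than `e-KTP` are hypothetical, and a plain string key `"ORTH"` stands in for spaCy's `ORTH` symbol so the example runs without spaCy installed.

```python
# Hypothetical sketch; names and sample entries are illustrative only.

# Exceptions kept one per line in a triple-quoted string. Blank lines and
# stray spaces are harmless: str.split() with no arguments splits on any
# run of whitespace and never yields empty strings.
EXCEPTIONS_TEXT = """
e-KTP

anak-anak
  buku-buku
"""


def build_exceptions(text):
    """Map each exception string (and its case variants) to one token."""
    exceptions = {}
    for orth in text.split():
        # Upper-case only the first character, leaving the rest untouched
        # (this is what "orth_first_upper" is assumed to mean here).
        orth_first_upper = orth[0].upper() + orth[1:]
        for variant in (orth, orth.lower(), orth.upper(), orth_first_upper):
            # "ORTH" is a stand-in key for spaCy's ORTH symbol.
            exceptions[variant] = [{"ORTH": variant}]
    return exceptions


EXCEPTIONS = build_exceptions(EXCEPTIONS_TEXT)
```

Each entry maps a surface form to a single-token analysis, so a string such as `e-KTP` would stay one token instead of being split at the hyphen. Generating the case variants in a loop keeps the list itself short, at the cost of a slightly larger dictionary.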