spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-14 13:47:13 +03:00

Andrew Ongko 81564cc4e8 Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
* adding e-KTP in tokenizer exceptions list
* add exception token
* removing lines containing spaces, as they won't matter since we use the .split() method in the end; added new tokens in exception
* add tokenizer exceptions list
* combining base_norms with norm_exceptions
* adding norm_exception
* fix double key in lemmatizer
* remove unused import on punctuation.py
* reformat stop_words to reduce number of lines, improve readability
* updating tokenizer exception
* implement is_currency for lang/id
* adding orth_first_upper in tokenizer_exceptions
* update the norm_exception list
* remove bunch of abbreviations
* adding contributors file
__init__.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
_tokenizer_exceptions_list.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
examples.py	added examples	2017-08-20 11:57:10 +07:00
lemmatizer.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
lex_attrs.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
norm_exceptions.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
punctuation.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
stop_words.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00
syntax_iterators.py	wip syntax iterators	2017-07-27 10:51:34 +07:00
tokenizer_exceptions.py	Update Indonesian model (#2752)	2018-09-14 12:30:32 +02:00