bn
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
da
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
de
Rename 'SP' special tag to '_SP'
2017-10-20 14:01:12 +02:00
en
Rename 'SP' special tag to '_SP'
2017-10-20 14:01:12 +02:00
es
Rename 'SP' special tag to '_SP'
2017-10-20 14:01:12 +02:00
fi
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
fr
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
he
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
hi
Add Language class, stop words and basic stemmer that sets NORM
2017-10-14 14:59:52 +02:00
hu
Add Hungarian examples (see #1107)
2017-10-17 02:37:45 +02:00
id
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
it
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
ja
Port over changes from #1157
2017-10-14 13:11:39 +02:00
nb
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
nl
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
pl
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
pt
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
sv
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
th
Rename 'SP' special tag to '_SP'
2017-10-20 14:01:12 +02:00
xx
Tidy up language data
2017-10-11 02:22:49 +02:00
zh
fixed SyntaxError while checking for jieba
2017-10-16 18:51:33 +05:30
__init__.py
Remove imports in /lang/__init__.py
2017-05-08 23:58:07 +02:00
char_classes.py
Update base punctuation
2017-10-14 14:59:23 +02:00
entity_rules.py
Reorganise entity rules
2017-05-09 01:37:10 +02:00
lex_attrs.py
Make lex attr functions top-level functions, to promote pickling
2017-10-17 18:19:18 +02:00
norm_exceptions.py
Update base norm exceptions with more unicode characters
2017-10-14 14:58:52 +02:00
punctuation.py
Add symbols class to punctuation rules to handle emoji (see #1088)
2017-05-27 17:57:10 +02:00
tag_map.py
Fix formatting
2017-05-09 11:08:14 +02:00
tokenizer_exceptions.py
Port over URL pattern changes from #1411
2017-10-14 12:58:07 +02:00