spaCy/spacy/lang
Lucas Abbade be7fdc59d1 Update lex_attrs.py (#2307)
* Update lex_attrs.py

Fixed spelling mistakes of some numbers (according to Brazilian Portuguese).

* Update lex_attrs.py

As requested, I've included the correct spelling for both Brazilian Portuguese and European Portuguese.

I would advise, however, that the two be separated in the future. Brazilian Portuguese is very different from European Portuguese: although most of the writing is unified, the way people speak in the two countries is radically different. Keeping both as a single language may lead to bigger issues in the future, especially when it comes to spell checking.
2018-05-09 20:49:31 +02:00
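
For illustration only, here is a rough sketch of how a `like_num`-style lexical attribute could accept both the Brazilian and the European spellings of a number word, as the commit above describes. The word list is a hypothetical, abbreviated sample and is not copied from `pt/lex_attrs.py`. The lowercasing and separator handling are assumptions too, and the real function may differ.

```python
# Hypothetical, abbreviated word list (NOT the actual contents of pt/lex_attrs.py).
_num_words = {
    "zero", "um", "dois", "três", "dez",
    "catorze", "quatorze",      # 14: both spellings occur
    "dezesseis", "dezasseis",   # 16: Brazilian / European spelling
    "dezessete", "dezassete",   # 17: Brazilian / European spelling
    "dezenove", "dezanove",     # 19: Brazilian / European spelling
    "vinte", "cem", "mil",
}


def like_num(text):
    """Return True if the text looks like a number.

    Accepts plain digits (ignoring "," and "." separators), simple
    fractions such as "3/4", and words from the sample list above.
    """
    # Strip thousands/decimal separators before the digit check.
    text = text.replace(",", "").replace(".", "")
    if text.isdigit():
        return True
    # A single slash with digits on both sides counts as a fraction.
    if text.count("/") == 1:
        num, denom = text.split("/")
        if num.isdigit() and denom.isdigit():
            return True
    # Assumption: compare case-insensitively against the word list.
    return text.lower() in _num_words
```

With this sketch, both `like_num("dezesseis")` and `like_num("dezasseis")` return `True`, so a single shared word list covers both variants without splitting the language.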
bn Fix PRON_LEMMA imports 2017-11-06 17:41:53 +01:00
da Add Danish lemmatizer (#2184) 2018-04-07 19:07:28 +02:00
de Fix PRON_LEMMA imports 2017-11-06 17:41:53 +01:00
en quick typo fix 2018-03-24 17:26:35 +01:00
es Fix Spanish noun_chunks (resolves #2210) 2018-04-18 18:44:01 -04:00
fa add persian language 2018-01-27 13:27:26 +03:30
fi Tidy up tokenizer exceptions 2017-11-01 23:02:45 +01:00
fr Update stop_words.py for French language (#2310) 2018-05-09 12:04:38 +02:00
ga Remove comma that caused list to wrap in tuple! 2017-10-31 20:13:16 +01:00
he Don't make copies of language data components 2017-10-11 15:34:55 +02:00
hi remove no-break spaces from Hindi example (fixes #1750) 2017-12-20 11:35:30 -08:00
hr Update stop_words.py 2018-03-24 17:31:24 +01:00
hu Don't copy exception dicts if not necessary and tidy up 2017-10-31 21:05:29 +01:00
id Find lowercased forms of numeric words 2018-01-08 03:25:08 +01:00
it Fix syntax error in italian lemmatizer 2018-04-03 23:13:22 +02:00
ja Port Japanese mecab tokenizer from v1 (#2036) 2018-05-03 18:38:26 +02:00
nb Copied French syntax iterator to simplify future changes 2018-02-05 14:45:05 +01:00
nl Find lowercased forms of ordinal words, where possible 2018-01-08 03:28:50 +01:00
pl Merge pull request #2142 from jimregan/polish-more-tokens 2018-03-24 19:06:44 +01:00
pt Update lex_attrs.py (#2307) 2018-05-09 20:49:31 +02:00
ro Add Romanian and Croatian skeletons (experimental) 2017-11-01 23:04:28 +01:00
ru Add Russian example sentences (see #1107) 2018-02-01 20:09:40 +01:00
sv fixes #2238 (#2241) 2018-04-28 14:55:22 +02:00
th Don't copy exception dicts if not necessary and tidy up 2017-10-31 21:05:29 +01:00
tr Port over Turkish changes 2018-03-24 17:31:07 +01:00
vi Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer (#2155) 2018-03-29 12:19:51 +02:00
xx Tidy up language data 2017-10-11 02:22:49 +02:00
zh add ChineseDefaults class for pickling 2017-12-28 17:13:58 +08:00
__init__.py Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00
char_classes.py add ٪ as punctuation 2018-01-23 18:11:33 +03:30
entity_rules.py Reorganise entity rules 2017-05-09 01:37:10 +02:00
lex_attrs.py Merge pull request #1891 from fucking-signup/master 2018-02-18 13:47:47 +01:00
norm_exceptions.py Update base norm exceptions with more unicode characters 2017-10-14 14:58:52 +02:00
punctuation.py Add symbols class to punctuation rules to handle emoji (see #1088) 2017-05-27 17:57:10 +02:00
tag_map.py Fix formatting 2017-05-09 11:08:14 +02:00
tokenizer_exceptions.py Tidy up tokenizer exceptions 2017-11-01 23:02:45 +01:00