Filipe Caixeta
6c498f9ff4
Update Portuguese Language (#2790)
* Add words to Portuguese language _num_words
* Portuguese - Add/remove stopwords, fix tokenizer, add currency symbols
* Extended punctuation and norm_exceptions in the Portuguese language
2018-09-29 09:51:45 +02:00
ines
ef597622a6
Add Portuguese tag map
2017-11-05 13:58:34 +01:00
ines
8ce6f96180
Don't make copies of language data components
2017-10-11 15:34:55 +02:00
ines
417d45f5d0
Add lemmatizer data as variable on language data
Don't create lookup lemmatizer within Language class and just pass in
the data so it can be set on Token creation
2017-10-11 02:24:58 +02:00
ines
0c2343d73a
Tidy up language data
2017-10-11 02:22:49 +02:00
ines
4c643d74c5
Add norm exceptions to other Language classes
2017-06-03 22:29:21 +02:00
ines
924e8506de
Move Defaults subclass to module scope (necessary for pickling)
2017-05-20 19:02:27 +02:00
ines
c13b3fa052
Add LEX_ATTRS
2017-05-12 15:37:45 +02:00
ines
73b577cb01
Fix relative imports
2017-05-08 22:29:04 +02:00
ines
f46ffe3e89
Move language data to /lang module
2017-05-08 20:00:40 +02:00