ines | 417d45f5d0 | Add lemmatizer data as variable on language data. Don't create lookup lemmatizer within Language class and just pass in the data so it can be set on Token creation | 2017-10-11 02:24:58 +02:00
ines | 0c2343d73a | Tidy up language data | 2017-10-11 02:22:49 +02:00
ines | ece30c28a8 | Don't split hyphenated words in German. This way, the tokenizer matches the tokenization in German treebanks | 2017-09-16 20:40:15 +02:00
ines | 1fe5e1a4d1 | Add language example sentences (see #1107): da, de, en, es, fr, he, it, nb, pl, pt, sv | 2017-08-19 12:22:29 +02:00
Matthew Honnibal | e28f90b672 | Fix syntax iterators | 2017-06-04 15:51:50 -05:00
ines | fa7e576c57 | Change order of exception dicts | 2017-06-03 21:52:06 +02:00
ines | e47eef5e03 | Update German tokenizer exceptions and tests | 2017-06-03 21:07:44 +02:00
ines | 0d6fa8b241 | Add German norm exceptions | 2017-06-03 20:54:18 +02:00
ines | 924e8506de | Move Defaults subclass to module scope (necessary for pickling) | 2017-05-20 19:02:27 +02:00
ines | 1a05078c79 | Add language-specific syntax iterators to en and de | 2017-05-17 12:04:03 +02:00
ines | 73b577cb01 | Fix relative imports | 2017-05-08 22:29:04 +02:00
ines | f46ffe3e89 | Move language data to /lang module | 2017-05-08 20:00:40 +02:00