ines | ece30c28a8 | Don't split hyphenated words in German. This way, the tokenizer matches the tokenization in German treebanks | 2017-09-16 20:40:15 +02:00 |
Matthew Honnibal | d5fbf27335 | Fix test | 2017-09-04 16:45:11 +02:00 |
Matthew Honnibal | 644d6c9e1a | Improve lemmatization tests, re #1296 | 2017-09-04 15:17:44 +02:00 |
Jim Geovedi | fbc62a09c7 | added {pre,suf,in}fix tests | 2017-08-20 13:43:00 +07:00 |
Jim Geovedi | cc4772cac2 | reworks | 2017-08-03 13:08:38 +07:00 |
Jim Geovedi | 783f7d8b86 | added test set for Indonesian language | 2017-07-29 18:21:07 +07:00 |
ines | cc9c5dc7a3 | Fix noun chunks test | 2017-06-05 16:39:04 +02:00 |
ines | a0f4592f0a | Update tests | 2017-06-05 02:26:13 +02:00 |
ines | 3e105bcd36 | Update tests | 2017-06-05 02:09:27 +02:00 |
Matthew Honnibal | 58be0e1f6f | Update tests | 2017-06-04 16:35:06 -05:00 |
Ines Montani | 112c5787eb | Merge pull request #1101 from oroszgy/hu_tokenizer_fix: More robust Hungarian tokenizer. | 2017-06-04 22:37:51 +02:00 |
ines | e47eef5e03 | Update German tokenizer exceptions and tests | 2017-06-03 21:07:44 +02:00 |
ines | d77c2cc8bb | Add tests for English norm exceptions | 2017-06-03 20:59:50 +02:00 |
Gyorgy Orosz | f0c3b09242 | More robust Hungarian tokenizer. | 2017-05-31 22:28:40 +02:00 |
ines | 20a7003c0d | Update model fixtures and reorganise tests | 2017-05-29 22:14:31 +02:00 |
ines | d0c6d4f76d | Fix formatting | 2017-05-23 11:32:00 +02:00 |
ines | 2c3bdd09b1 | Add English test for like_num | 2017-05-09 11:06:34 +02:00 |
ines | 22375eafb0 | Fix and merge attrs and lex_attrs tests | 2017-05-09 11:06:25 +02:00 |
ines | c714841cc8 | Move language-specific tests to tests/lang | 2017-05-09 00:02:37 +02:00 |
ines | 3c0f85de8e | Remove imports in /lang/__init__.py | 2017-05-08 23:58:07 +02:00 |