5731 Commits

Author SHA1 Message Date
Matthew Honnibal
57c4341453 Refactor loading of morphology exceptions, adding a method add_special_case. 2016-12-18 14:59:44 +01:00
Ines Montani
77cf2fb0f6 Remove unnecessary argument in test 2016-12-18 14:06:27 +01:00
Ines Montani
121c310566 Remove trailing whitespace 2016-12-18 14:06:27 +01:00
Matthew Honnibal
46e98ec029 Move init_model.py script from repo. These meta-tools should live elsewhere 2016-12-18 14:03:40 +01:00
Matthew Honnibal
d5840c488b Clean unused code from fabfile 2016-12-18 13:53:30 +01:00
Ines Montani
0fc4e45cb3 Fix tag map for German 2016-12-18 13:30:03 +01:00
Ines Montani
28326649f3 Fix typo 2016-12-18 13:30:03 +01:00
Matthew Honnibal
0595cc0635 Change test595 to mock data, instead of requiring model. 2016-12-18 13:28:51 +01:00
Matthew Honnibal
a4eb5c2bff Check POS key in lemmatizer, to update it for new data format 2016-12-18 13:28:20 +01:00
Matthew Honnibal
28d63ec58e Restore missing '' character in tokenizer exceptions. 2016-12-18 05:34:51 +01:00
Ines Montani
a9421652c9 Remove duplicates in tag map 2016-12-17 22:44:31 +01:00
Ines Montani
69baf1c9a8 Fix tag map 2016-12-17 22:44:22 +01:00
Ines Montani
577adad945 Fix formatting 2016-12-17 14:00:52 +01:00
Ines Montani
fc4ad17136 Fix typo 2016-12-17 14:00:47 +01:00
Ines Montani
bb94e784dc Fix typo 2016-12-17 13:59:30 +01:00
Ines Montani
afda532595 Use symbols in tag map 2016-12-17 13:56:24 +01:00
Ines Montani
07249145c9 Fix formatting 2016-12-17 13:34:46 +01:00
Ines Montani
dd55d085b6 Reformat dutch language data to match new style 2016-12-17 13:26:01 +01:00
Ines Montani
f2c48ef504 Resolve stopwords conflict to merge Dutch 2016-12-17 13:08:16 +01:00
Ines Montani
3dded56ae1 Add contributors from #688 2016-12-17 12:52:57 +01:00
Matthew Honnibal
ff03ade08f Merge pull request #688 from nlesc-sherlock/dutch
Support for Dutch in SpaCy
2016-12-17 22:44:58 +11:00
Ines Montani
a22322187f Add missing lemmas to tokenizer exceptions (fixes #674) 2016-12-17 12:42:41 +01:00
Ines Montani
5445074cbd Expand tokenizer exceptions with unicode apostrophe (fixes #685) 2016-12-17 12:34:08 +01:00
Ines Montani
e0a7b5c612 Fix formatting 2016-12-17 12:33:09 +01:00
Ines Montani
08162dce67 Move shared functions and constants to global language data 2016-12-17 12:32:48 +01:00
Ines Montani
6a60a61086 Move update_exc to global language data utils 2016-12-17 12:29:02 +01:00
Ines Montani
f324311249 Add global language data utils 2016-12-17 12:27:41 +01:00
Ines Montani
487ce1e20a Add encoding declaration 2016-12-17 12:25:44 +01:00
Ines Montani
d8d50a0334 Add tokenizer exception for "gonna" (fixes #691) 2016-12-17 11:59:28 +01:00
Ines Montani
c69b77d8aa Revert "Add exception for "gonna""
This reverts commit 280c03f67b.
2016-12-17 11:56:44 +01:00
Ines Montani
280c03f67b Add exception for "gonna" 2016-12-17 11:54:59 +01:00
Ines Montani
56b8e8446f Merge pull request #690 from davedwards/patch-1
Update index.jade
2016-12-15 23:55:50 +01:00
David Edwards
278199dd2c Update index.jade 2016-12-15 13:40:53 -08:00
Ines Montani
5031a015e2 Fix typo in stopwords (fixes #689) 2016-12-15 17:57:06 +01:00
dafnevk
af761fd664 Signed Contributer Agreement by Rob van Nieuwpoort 2016-12-15 10:34:19 +01:00
dafnevk
cdf5dcc40a fixed bug in init_model so that it runs for dutch 2016-12-13 14:33:44 +01:00
Janneke van der Zwaan
4a3fdcce8a Merge github.com:explosion/spaCy into dutch 2016-12-13 09:25:23 +01:00
Matthew Honnibal
3b72fee624 Merge pull request #680 from savvopoulos/train-ner-update
actually commit load_ner.py
2016-12-13 07:29:25 +11:00
Christos Savvopoulos
c19b83f6ae use model_dir inside of load_model 2016-12-12 20:23:24 +00:00
Christos Savvopoulos
93cf4af701 actually commit load_ner.py 2016-12-12 20:13:33 +00:00
Matthew Honnibal
c4d9ea1186 Merge pull request #679 from savvopoulos/train-ner-update
train_ner should save vocab; add load_ner example
2016-12-13 07:13:30 +11:00
Christos Savvopoulos
ad54a929f8 train_ner should save vocab; add load_ner example 2016-12-12 20:09:49 +00:00
Matthew Honnibal
bf59420b1f Merge pull request #677 from explosion/revert-676-patch-2
Revert "Add acl to symbols.pyx"
2016-12-12 10:15:13 +11:00
Matthew Honnibal
5965d3c2a7 Revert "Add acl to symbols.pyx" 2016-12-12 10:10:28 +11:00
Matthew Honnibal
6dee76dfed Update symbols.pxd 2016-12-12 10:09:58 +11:00
Ines Montani
fe10f9c702 Merge pull request #676 from pokey/patch-2
Add acl to symbols.pyx
2016-12-11 22:06:13 +01:00
Pokey Rule
18a15c0777 Add acl to symbols.pyx 2016-12-11 20:00:07 +00:00
Gyorgy Orosz
0cf2144d24 Adding partial hyphen and quote handling support. 2016-12-11 00:14:36 +01:00
Gyorgy Orosz
2051726fd3 Passing Hungatian abbrev tests. 2016-12-10 23:37:58 +01:00
Ines Montani
61783c5025 Merge pull request #675 from jaspb/patch-1
added 'en' to spacy.load(..)
2016-12-10 20:25:46 +01:00