71 Commits

Author | SHA1 | Message | Date
Matthew Honnibal | 82b8cc5efb | Whitespace | 2016-09-24 22:17:01 +02:00
Matthew Honnibal | 082e95b19e | Python 3 compatible basestring | 2016-09-24 22:09:21 +02:00
Matthew Honnibal | fd65cf6cbb | Finish refactoring data loading | 2016-09-24 20:26:17 +02:00
Matthew Honnibal | 83e364188c | Mostly finished loading refactoring. Design is in place, but doesn't work yet. | 2016-09-24 15:42:01 +02:00
Matthew Honnibal | 478a8d1829 | * Register Chinese language in spacy/__init__.py | 2016-04-24 18:45:16 +02:00
Matthew Honnibal | 8b4677d34d | * Add missing keyword arguments to spacy.load() function | 2016-04-17 21:31:50 +02:00
Henning Peters | f2d011c034 | avoid polluting spacy namespace with lang classes | 2016-04-12 16:31:16 +02:00
Henning Peters | c90d4a6f17 | relative imports in __init__.py | 2016-03-26 11:44:53 +01:00
Henning Peters | db095a162c | fix | 2016-03-25 18:59:47 +01:00
Henning Peters | b8f63071eb | add lang registration facility | 2016-03-25 18:54:45 +01:00
Henning Peters | a7d7ea3afa | first idea for supporting multiple langs in download script | 2016-03-24 11:19:43 +01:00
Henning Peters | eb7ae61b1c | cleanup api | 2016-03-08 12:59:18 +01:00
Henning Peters | aa4d964c14 | cleanup api | 2016-03-05 17:51:32 +01:00
Henning Peters | 931c07a609 | initial proposal for separate vector package | 2016-03-04 11:09:06 +01:00
Henning Peters | 846fa49b2a | distinct load() and from_package() methods | 2016-01-16 10:00:57 +01:00
Henning Peters | 788f734513 | refactored data_dir->via, add zip_safe, add spacy.load() | 2016-01-15 18:01:02 +01:00
Matthew Honnibal | fdaf24604a | * Basic punct tests updated and passing | 2014-08-27 19:38:57 +02:00
Matthew Honnibal | 01469b0888 | * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. | 2014-08-18 19:14:00 +02:00
Matthew Honnibal | 365a2af756 | * Restore happax. commit uncommited work | 2014-08-02 21:27:03 +01:00
Matthew Honnibal | a895fe5ddb | * Upd from spacy | 2014-07-23 17:35:18 +01:00
Matthew Honnibal | 556f6a18ca | * Initial commit. Tests passing for punctuation handling. Need contractions, file transport, tokenize function, etc. | 2014-07-05 20:51:42 +02:00