182 Commits

Author SHA1 Message Date
Wolfgang Seeker
5e2e8e951a add baseclass DocIterator for iterators over documents
add classes for English and German noun chunks

the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model
2016-03-16 15:53:35 +01:00
Wolfgang Seeker
03fb498dbe introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
2016-03-10 13:01:34 +01:00
Henning Peters
9cc4f8d5b3 avoid shadowing __name__ 2016-02-15 01:33:39 +01:00
Matthew Honnibal
445164d5b4 * Restore the LOCAL_DATA_DIR global in spacy/en/__init__.py, although this is now deprecated 2016-01-19 02:54:56 +01:00
Henning Peters
5551052840 fix py2/3 issue 2016-01-16 12:44:53 +01:00
Henning Peters
235f094534 untangle data_path/via 2016-01-16 12:23:45 +01:00
Henning Peters
211913d689 add about.py, adapt setup.py 2016-01-15 18:57:01 +01:00
Henning Peters
780cb847c9 add default_model to about 2016-01-15 18:07:15 +01:00
Henning Peters
788f734513 refactored data_dir->via, add zip_safe, add spacy.load() 2016-01-15 18:01:02 +01:00
Henning Peters
9b75d872b0 fix model download 2016-01-14 12:02:56 +01:00
Matthew Honnibal
187960606f * Fix pickle problems 2015-12-28 16:54:03 +01:00
Henning Peters
32d655b6e1 bump version 2015-12-28 09:34:39 +01:00
Matthew Honnibal
8b61d45ed0 * Fix merge conflicts for headers branch 2015-12-27 17:46:25 +01:00
Henning Peters
0e321a7105 get mingw32 to work 2015-12-22 23:25:38 +01:00
Henning Peters
8359bd4d93 strip data/ from package, friendlier Language invocation, make data_dir backward/forward-compatible 2015-12-18 09:52:55 +01:00
Henning Peters
970278a3d6 no need to link data dir anymore 2015-12-18 09:49:45 +01:00
Henning Peters
2d4efe40f9 fix sputnik call 2015-12-13 14:46:08 +01:00
Henning Peters
ac318b568c new approach to dependency headers 2015-12-13 11:49:17 +01:00
Henning Peters
9027cef3bc access model via sputnik 2015-12-07 06:01:28 +01:00
Henning Peters
73e5650be5 change index server 2015-11-18 18:09:46 +01:00
Henning Peters
50d15ea5d2 fix 2015-11-18 17:35:21 +01:00
Henning Peters
919a4f0b04 change data path, add repository 2015-11-18 11:40:46 +01:00
Henning Peters
12de895e60 fix version 2015-11-15 16:38:16 +01:00
Henning Peters
03d2f98cd5 add sputnik 2015-11-15 15:58:21 +01:00
Matthew Honnibal
3b74739c3e * Download updated data 2015-11-08 21:24:25 +11:00
Matthew Honnibal
ffedff9e6c * Remove the archive after download, to save disk space 2015-11-03 18:54:05 +11:00
Matthew Honnibal
ff4fe524ee * Fix exception for python 2 2015-10-23 01:56:13 +02:00
Matthew Honnibal
341a3e85cd * Upd downloaded data version 2015-10-23 00:56:57 +02:00
Henning Peters
ccffd2ef53 fixed extract directory 2015-10-21 07:59:34 +02:00
Henning Peters
da4c9cee06 assert filename match 2015-10-20 19:33:59 +02:00
Henning Peters
4f703f0cb4 better error reporting, cleanup 2015-10-20 19:11:29 +02:00
Matthew Honnibal
9cdea6e450 * Import uget correctly 2015-10-19 08:32:41 +02:00
Henning Peters
bfde91fa49 add custom download tool (uget), replace wget with uget 2015-10-18 12:35:04 +02:00
Matthew Honnibal
e886e6a406 * Inc version 2015-10-13 13:46:17 +11:00
Matthew Honnibal
a3dfe2b901 * Increment data version 2015-10-09 13:26:17 +02:00
Matthew Honnibal
b228a8f4a6 * Remove spacy/en/attrs 2015-10-06 16:20:46 +11:00
Matthew Honnibal
693677fd8d * Prepare to remove en/attrx file, now that moving to symbols.pyx 2015-10-06 16:20:13 +11:00
Matthew Honnibal
ecc5281b36 * Remove en/pos.pyx, as the tagger code now lives in spacy/tagger.pyx 2015-10-06 10:12:08 +11:00
Robert
8711b64860 Force SSL for downloading English language data.
It would also be nice to have a checksum for this.
2015-09-21 17:26:01 -07:00
Matthew Honnibal
e13e47e9e5 * Add English stop words 2015-09-14 17:48:51 +10:00
Matthew Honnibal
0b7d2a6c62 * Inc version 2015-09-13 01:26:29 +02:00
Matthew Honnibal
e2ef78b29c * Gut pos.pyx module, since functionality moved to spacy/tagger.pyx 2015-08-26 19:15:42 +02:00
Matthew Honnibal
c4d8754385 * Specify LOCAL_DATA_DIR global in spacy.en.__init__.py 2015-08-26 19:15:07 +02:00
Matthew Honnibal
c5a27d1821 * Move lemmatizer to spacy 2015-08-25 15:47:08 +02:00
Matthew Honnibal
82217c6ec6 * Generalize lemmatizer 2015-08-25 15:46:19 +02:00
Matthew Honnibal
8083a07c3e * Use language base class 2015-08-25 15:37:30 +02:00
Matthew Honnibal
5dd76be446 * Split EnPosTagger up into base class and subclass 2015-08-24 05:25:55 +02:00
Matthew Honnibal
6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal
cad0cca4e3 * Tmp 2015-08-22 22:04:34 +02:00
Matthew Honnibal
5737115e1e * Work on gazetteer matching 2015-08-06 14:33:21 +02:00