Wolfgang Seeker
5e2e8e951a
add baseclass DocIterator for iterators over documents
add classes for English and German noun chunks
the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model
2016-03-16 15:53:35 +01:00
Wolfgang Seeker
03fb498dbe
introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
2016-03-10 13:01:34 +01:00
Henning Peters
9cc4f8d5b3
avoid shadowing __name__
2016-02-15 01:33:39 +01:00
Matthew Honnibal
445164d5b4
* Restore the LOCAL_DATA_DIR global in spacy/en/__init__.py, although this is now deprecated
2016-01-19 02:54:56 +01:00
Henning Peters
5551052840
fix py2/3 issue
2016-01-16 12:44:53 +01:00
Henning Peters
235f094534
untangle data_path/via
2016-01-16 12:23:45 +01:00
Henning Peters
211913d689
add about.py, adapt setup.py
2016-01-15 18:57:01 +01:00
Henning Peters
780cb847c9
add default_model to about
2016-01-15 18:07:15 +01:00
Henning Peters
788f734513
refactored data_dir->via, add zip_safe, add spacy.load()
2016-01-15 18:01:02 +01:00
Henning Peters
9b75d872b0
fix model download
2016-01-14 12:02:56 +01:00
Matthew Honnibal
187960606f
* Fix pickle problems
2015-12-28 16:54:03 +01:00
Henning Peters
32d655b6e1
bump version
2015-12-28 09:34:39 +01:00
Matthew Honnibal
8b61d45ed0
* Fix merge conflicts for headers branch
2015-12-27 17:46:25 +01:00
Henning Peters
0e321a7105
get mingw32 to work
2015-12-22 23:25:38 +01:00
Henning Peters
8359bd4d93
strip data/ from package, friendlier Language invocation, make data_dir backward/forward-compatible
2015-12-18 09:52:55 +01:00
Henning Peters
970278a3d6
no need to link data dir anymore
2015-12-18 09:49:45 +01:00
Henning Peters
2d4efe40f9
fix sputnik call
2015-12-13 14:46:08 +01:00
Henning Peters
ac318b568c
new approach to dependency headers
2015-12-13 11:49:17 +01:00
Henning Peters
9027cef3bc
access model via sputnik
2015-12-07 06:01:28 +01:00
Henning Peters
73e5650be5
change index server
2015-11-18 18:09:46 +01:00
Henning Peters
50d15ea5d2
fix
2015-11-18 17:35:21 +01:00
Henning Peters
919a4f0b04
change data path, add repository
2015-11-18 11:40:46 +01:00
Henning Peters
12de895e60
fix version
2015-11-15 16:38:16 +01:00
Henning Peters
03d2f98cd5
add sputnik
2015-11-15 15:58:21 +01:00
Matthew Honnibal
3b74739c3e
* Download updated data
2015-11-08 21:24:25 +11:00
Matthew Honnibal
ffedff9e6c
* Remove the archive after download, to save disk space
2015-11-03 18:54:05 +11:00
Matthew Honnibal
ff4fe524ee
* Fix exception for python 2
2015-10-23 01:56:13 +02:00
Matthew Honnibal
341a3e85cd
* Upd downloaded data version
2015-10-23 00:56:57 +02:00
Henning Peters
ccffd2ef53
fixed extract directory
2015-10-21 07:59:34 +02:00
Henning Peters
da4c9cee06
assert filename match
2015-10-20 19:33:59 +02:00
Henning Peters
4f703f0cb4
better error reporting, cleanup
2015-10-20 19:11:29 +02:00
Matthew Honnibal
9cdea6e450
* Import uget correctly
2015-10-19 08:32:41 +02:00
Henning Peters
bfde91fa49
add custom download tool (uget), replace wget with uget
2015-10-18 12:35:04 +02:00
Matthew Honnibal
e886e6a406
* Inc version
2015-10-13 13:46:17 +11:00
Matthew Honnibal
a3dfe2b901
* Increment data version
2015-10-09 13:26:17 +02:00
Matthew Honnibal
b228a8f4a6
* Remove spacy/en/attrs
2015-10-06 16:20:46 +11:00
Matthew Honnibal
693677fd8d
* Prepare to remove en/attrx file, now that moving to symbols.pyx
2015-10-06 16:20:13 +11:00
Matthew Honnibal
ecc5281b36
* Remove en/pos.pyx, as the tagger code now lives in spacy/tagger.pyx
2015-10-06 10:12:08 +11:00
Robert
8711b64860
Force SSL for downloading English language data.
It would also be nice to have a checksum for this.
2015-09-21 17:26:01 -07:00
Matthew Honnibal
e13e47e9e5
* Add English stop words
2015-09-14 17:48:51 +10:00
Matthew Honnibal
0b7d2a6c62
* Inc version
2015-09-13 01:26:29 +02:00
Matthew Honnibal
e2ef78b29c
* Gut pos.pyx module, since functionality moved to spacy/tagger.pyx
2015-08-26 19:15:42 +02:00
Matthew Honnibal
c4d8754385
* Specify LOCAL_DATA_DIR global in spacy.en.__init__.py
2015-08-26 19:15:07 +02:00
Matthew Honnibal
c5a27d1821
* Move lemmatizer to spacy
2015-08-25 15:47:08 +02:00
Matthew Honnibal
82217c6ec6
* Generalize lemmatizer
2015-08-25 15:46:19 +02:00
Matthew Honnibal
8083a07c3e
* Use language base class
2015-08-25 15:37:30 +02:00
Matthew Honnibal
5dd76be446
* Split EnPosTagger up into base class and subclass
2015-08-24 05:25:55 +02:00
Matthew Honnibal
6f1743692a
* Work on language-independent refactoring
2015-08-23 20:49:18 +02:00
Matthew Honnibal
cad0cca4e3
* Tmp
2015-08-22 22:04:34 +02:00
Matthew Honnibal
5737115e1e
* Work on gazetteer matching
2015-08-06 14:33:21 +02:00