438 Commits

Author SHA1 Message Date
Henning Peters
9cc4f8d5b3 avoid shadowing __name__ 2016-02-15 01:33:39 +01:00
Henning Peters
4c9e3c7911 upgrade spuntik, enforce data api via model version constraints 2016-02-14 16:03:17 +01:00
Henning Peters
3b5f1e753b py26 compatibility 2016-02-10 14:32:54 +01:00
Henning Peters
c00dd43fe0 add sun data 2016-02-09 16:42:55 +01:00
Matthew Honnibal
860fd11e98 * Don't import include files --- use the repository 2016-02-06 23:59:47 +01:00
Matthew Honnibal
8bd16ce8f7 * Try to fix win32 compilation 2016-02-05 14:43:52 +01:00
Matthew Honnibal
add8f07f61 * Conditionally link against openmp, on not-darwin 2016-02-05 12:19:51 +01:00
Matthew Honnibal
c9aa91041d * Don't expect openmp in options 2016-02-02 13:50:25 +01:00
Matthew Honnibal
490ba65398 * Use openmp in parser 2016-02-01 03:08:42 +01:00
Matthew Honnibal
9c34ca9e5d * Add _stack to mod_names 2016-02-01 03:00:53 +01:00
Matthew Honnibal
bc0f0d284c * Require different thinc version 2016-01-30 20:29:24 +01:00
Henning Peters
65aeac24cb remove package version constraint 2016-01-21 17:40:51 +01:00
Henning Peters
211913d689 add about.py, adapt setup.py 2016-01-15 18:57:01 +01:00
Henning Peters
ccd87ad7fb add default_model to about 2016-01-15 18:12:01 +01:00
Henning Peters
780cb847c9 add default_model to about 2016-01-15 18:07:15 +01:00
Henning Peters
788f734513 refactored data_dir->via, add zip_safe, add spacy.load() 2016-01-15 18:01:02 +01:00
Henning Peters
bc229790ac integrate with sputnik 2016-01-13 19:46:17 +01:00
Matthew Honnibal
e38205a838 * Pin versions to ranges, to escape version lock 2015-12-31 02:09:55 +01:00
Henning Peters
1c4352c42e bump version 2015-12-28 13:53:26 +01:00
Henning Peters
a404bfec38 bump preshed version 2015-12-22 22:38:25 +01:00
Henning Peters
46fe3a7327 bump thinc version 2015-12-22 13:21:24 +01:00
Henning Peters
1643e63c31 bump preshed version 2015-12-22 11:23:25 +01:00
Henning Peters
4a1d843682 bump murmurhash version 2015-12-21 21:59:11 +01:00
Henning Peters
74dc02a0e6 fix windows readme 2015-12-21 21:58:53 +01:00
Henning Peters
c17ce6c119 (re-)include cython sources, murmurhash header discovery 2015-12-21 12:40:44 +01:00
Henning Peters
b667020e81 refactor setup.py 2015-12-13 23:39:29 +01:00
Henning Peters
4f4b1d8f3d refactor setup.py 2015-12-13 23:32:23 +01:00
Henning Peters
eaadca2bf2 get buildbot running 2015-12-13 14:13:46 +01:00
Henning Peters
73674a4afb try using system-wide headers 2015-12-13 12:51:23 +01:00
Henning Peters
b2f66f7b8d try using system-wide headers 2015-12-13 12:45:30 +01:00
Henning Peters
63d74ae8f3 try using system-wide headers 2015-12-13 12:41:46 +01:00
Henning Peters
92fabd0114 wrap virtualenv around cythonize 2015-12-13 12:32:22 +01:00
Henning Peters
ac318b568c new approach to dependency headers 2015-12-13 11:49:17 +01:00
Matthew Honnibal
65413ad7b3 Merge pull request #186 from henningpeters/master
website build was broken for me, fixed it
2015-11-29 15:36:52 +11:00
Henning Peters
abe6162e7b avoid redirect 2015-11-24 20:01:43 +01:00
Henning Peters
4e98ea4e41 bump version 2015-11-21 19:04:57 +01:00
Matthew Honnibal
d8c52560d1 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-11-19 11:00:11 +01:00
Matthew Honnibal
44e563d4e5 * Pin version of murmurhash 2015-11-19 10:59:51 +01:00
Matthew Honnibal
73d47c3010 Merge pull request #185 from henningpeters/sputnik
integrate sputnik
2015-11-19 20:59:09 +11:00
Matthew Honnibal
1e166eb9cd * Upgrade spacy version 2015-11-18 17:42:56 +01:00
Henning Peters
919a4f0b04 change data path, add repository 2015-11-18 11:40:46 +01:00
Henning Peters
12de895e60 fix version 2015-11-15 16:38:16 +01:00
Matthew Honnibal
6dd37c5ee4 * Fix requirement of preshed 2015-11-08 18:09:21 +01:00
Matthew Honnibal
f9d20b1318 * Require updated thinc 2015-11-08 21:32:21 +11:00
Matthew Honnibal
3c162dcac3 * Refactor away from the _ml module, to use thinc 4.0. Still some work needs to be done, e.g. to add __reduce__ to the models, more testing, etc. 2015-11-07 03:24:30 +11:00
Matthew Honnibal
c339783bbe * Fix reference to tests.span in setup 2015-11-07 03:23:14 +11:00
Matthew Honnibal
802ad3d71a * Avoid compiling theano module for now 2015-11-06 00:24:43 +11:00
Matthew Honnibal
3ddea19b2b * Rename spans.pyx to span.pyx 2015-11-04 00:14:40 +11:00
Matthew Honnibal
9482d616bc * Rename spans.pyx to span.pyx 2015-11-03 23:51:05 +11:00
Matthew Honnibal
f81389abe0 * Pin to specific cymem, preshed and thinc versions. 2015-11-03 23:12:13 +11:00
Matthew Honnibal
7adef3f831 * Increment version 2015-11-03 07:58:59 +01:00
Matthew Honnibal
64531d5a3a * Define package_data in one place 2015-11-03 17:07:43 +11:00
Matthew Honnibal
5ca31e05fb * Prune down package data, as models are distributed entirely within the data download. 2015-11-03 13:30:37 +11:00
Matthew Honnibal
f56209ef2e * Update requirements 2015-11-03 02:40:01 +11:00
Matthew Honnibal
09e0b15629 * Package tests, for distriution in PyPi 2015-10-26 00:30:33 +11:00
Matthew Honnibal
b0ba534d4a * Fix license descriptor in setup.py 2015-10-26 00:16:37 +11:00
Matthew Honnibal
9ee1ddab7e * Increment version 2015-10-23 02:04:48 +02:00
Matthew Honnibal
108138366f * Ensure .pxd files are packaged 2015-10-23 01:57:03 +02:00
Matthew Honnibal
2348a08481 * Load/dump strings with a json file, instead of the hacky strings file we were using. 2015-10-22 21:13:03 +11:00
Matthew Honnibal
579670e4c7 * Fix uget 2015-10-19 17:23:33 +11:00
Matthew Honnibal
984775e5e2 * Fix setup of uget 2015-10-19 17:19:05 +11:00
Matthew Honnibal
e25adce54d Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-10-19 17:17:33 +11:00
Matthew Honnibal
382cbc8cab * Add uget to setup.py 2015-10-19 17:15:40 +11:00
Matthew Honnibal
a43777cef8 * Inc version 2015-10-19 07:46:42 +02:00
Henning Peters
bfde91fa49 add custom download tool (uget), replace wget with uget 2015-10-18 12:35:04 +02:00
Matthew Honnibal
fc261195f7 * Fix compilation for OSX 2015-10-18 17:19:07 +11:00
Matthew Honnibal
710e8fb168 * Fix platform condition re Issue #138 2015-10-15 20:46:08 +11:00
maxirmx
1b8fd329b8 Merge remote-tracking branch 'refs/remotes/honnibal/master' 2015-10-13 11:28:17 +03:00
Matthew Honnibal
d74a1e51d7 * Add cloudpickle requirement 2015-10-13 19:05:20 +11:00
maxirmx
3dbec0902f Merge remote-tracking branch 'refs/remotes/honnibal/master'
Conflicts -- pushing preshed 0.42
	requirements.txt
	setup.py
2015-10-13 10:16:16 +03:00
maxirmx
237db7f519 Appveyor build #5
Added Wordnet download
2015-10-13 10:11:56 +03:00
Matthew Honnibal
41cbbdefe3 Merge branch 'attrs' 2015-10-13 05:03:25 +02:00
Matthew Honnibal
1ca1beff4b * Allow preshed v0.42 in setup.py 2015-10-13 13:55:50 +11:00
Matthew Honnibal
b866f1443e Merge branch 'master' of https://github.com/honnibal/spaCy into attrs 2015-10-13 04:52:27 +02:00
Matthew Honnibal
6c2da06c18 * Package tag_map.json 2015-10-13 13:52:10 +11:00
Matthew Honnibal
e886e6a406 * Inc version 2015-10-13 13:46:17 +11:00
maxirmx
bf963c3cce Merging Windows\Linux versions of setup.py
Python 3.0 compatibility fix
2015-10-13 02:11:21 +03:00
maxirmx
ccf6156261 Merging Windows\Linux versions of setup.py #2 2015-10-13 01:46:52 +03:00
maxirmx
7c5bfc5916 Merging Windows/Linux versions of setup.py 2015-10-13 01:31:59 +03:00
maxirmx
9d949c857b More dirty Windows stuff - just for now 2015-10-10 20:11:20 +03:00
maxirmx
8e03239ac5 Merge remote-tracking branch 'refs/remotes/honnibal/master'
Conflicts:
	setup.py
2015-10-10 17:38:06 +03:00
maxirmx
815994a212 MSVC x86-64 Pyton 2.7 dirty build 2015-10-10 17:32:44 +03:00
Matthew Honnibal
064bd69ad0 * Refactor symbols, so that frequency rank can be derived from the orth id of a word. 2015-10-10 16:03:48 +11:00
Matthew Honnibal
8b8d048385 Merge pull request #135 from henningpeters/patch-1
remove compile warning noise
2015-10-10 01:40:15 +11:00
Matthew Honnibal
af8d0a2a09 * Increment version 2015-10-09 12:42:41 +02:00
Henning Peters
0e13f18ea4 remove compile warning noise 2015-10-09 07:23:39 +02:00
Matthew Honnibal
4513bed175 * Avoid compiling unused files 2015-10-08 14:00:34 +11:00
Matthew Honnibal
e562f504ee * Fix license metadata in setup.py 2015-09-29 23:02:37 +10:00
Matthew Honnibal
361f6fdd74 * Inc version 2015-09-22 02:22:27 +02:00
Matthew Honnibal
c3dea8bc8b * Inc version 2015-09-21 10:58:11 +02:00
Matthew Honnibal
0b7d2a6c62 * Inc version 2015-09-13 01:26:29 +02:00
Matthew Honnibal
c301bebd33 Merge branch 'master' of https://github.com/honnibal/spaCy into develop 2015-09-09 10:55:39 +02:00
Vsevolod Solovyov
39cfe28f33 Correctly pass link_args in c_ext() in setup.py 2015-08-24 12:52:05 +03:00
Matthew Honnibal
5dd76be446 * Split EnPosTagger up into base class and subclass 2015-08-24 05:25:55 +02:00
Matthew Honnibal
47db3067a0 * Compile spacy.matcher 2015-08-05 23:48:11 +02:00
Matthew Honnibal
4a0f40ec2d * Ensure data is packaged in vocab 2015-07-27 02:14:36 +02:00
Matthew Honnibal
b532f4eaa2 * Ensure serialize is packaged. 2015-07-27 01:51:37 +02:00
Matthew Honnibal
62da5eb338 * Inc version 2015-07-26 22:22:54 +02:00
Matthew Honnibal
65f3ce6c52 * Require preshed 0.41 2015-07-25 22:36:43 +02:00
Matthew Honnibal
287d90e792 * Use thinc 3.3 2015-07-24 04:52:50 +02:00
Matthew Honnibal
06eac32610 * Add cfile.pyx 2015-07-23 01:10:36 +02:00
Matthew Honnibal
a9149fdcbd * Compile attrs.pyx 2015-07-17 16:39:25 +02:00
Matthew Honnibal
db9dfd2e23 * Major refactor of serialization. Nearly complete now. 2015-07-17 01:27:54 +02:00
Matthew Honnibal
38ca0c33f5 Merge branch 'neuralnet' into refactor
Mostly refactors parser, to use new thinc3.2 Example class.
Aim is to remove use of shared memory, so that we can parallelize
over documents easily.

Conflicts:
	setup.py
	spacy/syntax/parser.pxd
	spacy/syntax/parser.pyx
	spacy/syntax/stateclass.pyx
2015-07-14 14:13:47 +02:00
Matthew Honnibal
d87d71caf4 * Compile the new modules after refactor 2015-07-13 22:29:33 +02:00
Matthew Honnibal
703ca40420 * Inc version 2015-07-08 20:07:23 +02:00
Matthew Honnibal
1e8dd0e2c5 * Comple senses.pyx 2015-07-01 18:49:15 +02:00
Matthew Honnibal
90e2059200 * Include spacy.munge in the built library 2015-06-30 18:35:39 +02:00
Matthew Honnibal
5d595b5a8c * Inc versions 2015-06-30 18:11:06 +02:00
Matthew Honnibal
8e7ffd2cdd * Use thinc 3.1 2015-06-29 02:13:23 +02:00
Matthew Honnibal
9282a8e72c * Prepare for new models to be plugged in by using Example class 2015-06-28 11:02:35 +02:00
Matthew Honnibal
4944d3ba20 * Update requirement to thinc 3.0 2015-06-28 06:21:20 +02:00
Matthew Honnibal
dc10aa2518 * Increment version 2015-06-24 04:52:15 +02:00
Matthew Honnibal
34c0ef2ee8 * Don't compile the orig_arc_eager and tree_arc_eager modules used for the EMNLP paper 2015-06-23 05:38:17 +02:00
Matthew Honnibal
a5ae98a543 * Add tree_arc_eager to setup.py 2015-06-15 08:22:59 +02:00
Matthew Honnibal
bcfdf126a4 * Add toggle for OrigArcEager system 2015-06-14 20:28:14 +02:00
Matthew Honnibal
e2f9a80713 * Remove old _state imports 2015-06-10 07:09:17 +02:00
Matthew Honnibal
d70304b7dd * Require newer thinc 2015-06-10 04:20:42 +02:00
Matthew Honnibal
09617a4638 * Whitespace 2015-06-09 21:20:33 +02:00
Matthew Honnibal
00a0dfcb59 * Avoid shipping the spacy.munge package 2015-06-08 00:54:13 +02:00
Matthew Honnibal
22f1ad012e * Add spacy.munge to list of packages 2015-06-07 22:28:13 +02:00
Matthew Honnibal
ce8e524825 * Fix requirements in setup.py 2015-06-07 22:24:21 +02:00
Matthew Honnibal
48bc4122d8 * Upd version in setup.py 2015-06-07 19:05:28 +02:00
Matthew Honnibal
cc7439a16b * Don't use alignment.pyx file, move functionality to spacy.gold 2015-05-24 21:51:15 +02:00
Matthew Honnibal
fc75210941 * Move spacy.syntax.conll to spacy.gold 2015-05-24 21:35:02 +02:00
Matthew Honnibal
bfeb29ebd1 * Tmp commit 2015-05-24 02:50:14 +02:00
Matthew Honnibal
03ebf70a66 * Inc version to 0.84 2015-05-12 02:38:51 +02:00
Jordan Suchow
3a8d9b37a6 Remove trailing whitespace 2015-04-19 13:01:38 -07:00
Jordan Suchow
5f0f940a1f Remove unused imports 2015-04-19 01:05:22 -07:00
Matthew Honnibal
716ba06711 * Inc version 2015-04-16 04:28:15 +02:00
Matthew Honnibal
05d0f078bb * Inc version 2015-04-13 22:29:31 +02:00
Matthew Honnibal
ab53855dfe * Bump version 2015-04-13 06:08:22 +02:00
Matthew Honnibal
11c4794e56 * Bump version number 2015-04-12 07:17:32 +02:00
Matthew Honnibal
8f68b864c4 * Move Span/Spans to separate files. Currently duplicates lots of Tokens functionality. Should probably be integrated into Tokens 2015-03-26 16:44:48 +01:00
Matthew Honnibal
e99f19dd6c * Fix clean function 2015-03-26 16:44:44 +01:00
Matthew Honnibal
357dcdcc01 * Fix clean function 2015-03-26 16:44:44 +01:00
Matthew Honnibal
8da53cbe3c * Fix setup.py, so that when compiling, only the necessary files are compiled 2015-03-26 16:44:43 +01:00
Matthew Honnibal
6e86790a4e * Add new syntax modules to setup.py 2015-03-26 16:44:42 +01:00
Matthew Honnibal
c341bfb0a2 * Inc version 2015-03-03 05:46:14 -05:00
Matthew Honnibal
827a2337b0 * Inc version 2015-02-27 03:56:54 -05:00
Matthew Honnibal
74015da94b * Inc version 2015-02-23 15:40:41 -05:00
Matthew Honnibal
6102360111 * Add -Wno-strict-prototypes, to suppress warning 2015-02-21 20:04:37 -05:00
Matthew Honnibal
ba1d3ddd7f * Move -lc++ link arg to only be used if darwin is OS. Should actually check whether GCC is compiler 2015-02-18 06:10:43 -05:00
Matthew Honnibal
59b46e4c2f * Move libc++ argument back under check for darwin. This assumes that extensions on OSX will be built with clang, but OSX GCC builds are also possible. Need to detect compiler and disable this flag 2015-02-18 06:03:45 -05:00
Matthew Honnibal
aa475673ee * Tweak compile args for OSX 2015-02-18 05:41:11 -05:00
Matthew Honnibal
b4edd1d907 * Make new compile args conditional on darwin, as they're invalid on Linux 2015-02-18 05:09:50 -05:00
Matthew Honnibal
e885903dc6 * Add compile args to fix conda compilation on OSX, and increment version 2015-02-18 05:01:27 -05:00
Matthew Honnibal
69d27d55b0 * Inc version, with new orphan-token bug fix 2015-02-16 16:52:54 -05:00
Matthew Honnibal
789a6fe462 * Inc version --- 0.63 seems to have been packaged incorrectly, to not include a bug fix to tokens.pyx to transfer ownership to Token objects 2015-02-16 11:56:14 -05:00
Matthew Honnibal
773d209405 * Inc version to 0.63 2015-02-11 18:39:41 -05:00
Matthew Honnibal
f0a9d2cb9c * Inc version 2015-02-11 14:20:57 -05:00
Matthew Honnibal
5ff2b5c8f0 * Inc version 2015-02-10 10:16:09 -05:00
Matthew Honnibal
29bdf0d05a * Inc version 2015-02-09 10:22:06 -05:00
Matthew Honnibal
407bb5da8b * Increment version 2015-02-09 09:46:20 -05:00
Matthew Honnibal
933c188eb5 * Inc version 2015-02-07 13:14:27 -05:00
Matthew Honnibal
ef795aece8 * Upd release 2015-02-07 12:26:34 -05:00
Matthew Honnibal
330b1a7a3d * Inc version 2015-02-07 11:32:13 -05:00
Matthew Honnibal
bfe1bcc02d * Rename 0.4.0 to 0.40 2015-02-01 18:32:01 +11:00
Matthew Honnibal
dea1245311 * Require advanced version of cymem 2015-02-01 17:04:59 +11:00
Matthew Honnibal
ac20a53509 * Change version to 0.4.0 2015-02-01 16:54:13 +11:00
Matthew Honnibal
d1a5091052 * Require six 2015-02-01 16:24:50 +11:00
Matthew Honnibal
754f4aed8e * Inc version 2015-01-31 22:49:46 +11:00
Matthew Honnibal
a3955fd8d5 * Require plac 2015-01-31 13:50:53 +11:00
Matthew Honnibal
6c081dd1fc * Handle failure when numpy headers are already installed correctly 2015-01-30 19:48:19 +11:00
Matthew Honnibal
f0bbffca8d * Fix the way numpy headers are installed during compilation from source 2015-01-30 18:14:45 +11:00
Matthew Honnibal
781dd712dc * Fix numpy commit problem 2015-01-28 14:00:20 +11:00
Matthew Honnibal
8cd5a91063 * Inc version, and add wget as requirement 2015-01-25 23:00:54 +11:00
Matthew Honnibal
419fef7627 * Inc version 2015-01-25 22:15:47 +11:00
Matthew Honnibal
77a61a8970 * Inc version 2015-01-25 18:51:35 +11:00
Matthew Honnibal
7a750983b9 * Don't package word vectors in source dist 2015-01-25 16:58:38 +11:00
Matthew Honnibal
845bd2e50d * Add parts_of_speech to setup 2015-01-25 16:32:48 +11:00
Matthew Honnibal
7588adf5e7 * Add numpy to install requires 2015-01-25 14:49:10 +11:00
Matthew Honnibal
0250f39741 * Inc version 2015-01-25 02:25:16 +11:00
Matthew Honnibal
b183dff72d * Remove stray print statement from setup 2015-01-22 02:06:42 +11:00
Matthew Honnibal
e579dd39ca * Load numpy headers 2015-01-17 16:19:54 +11:00
Matthew Honnibal
9818d7419e * Inc version 2015-01-09 05:14:29 +11:00
Matthew Honnibal
a0eb450e82 * Inc version 2015-01-08 01:19:57 +11:00
Matthew Honnibal
03a10e6cf2 * Inc version --- last didn't pack the correct cpp files. 2015-01-08 01:08:17 +11:00
Matthew Honnibal
c096fe84f7 * Inc version 2015-01-08 00:10:31 +11:00
Matthew Honnibal
2f9884a2d5 * Rename 2015-01-06 13:05:43 +11:00
Matthew Honnibal
b91d0cb584 * Increment version 2015-01-06 12:35:11 +11:00
Matthew Honnibal
def7e98bd3 * Add monkey-patch to fix pypy compilation 2015-01-06 12:34:55 +11:00
Matthew Honnibal
cda5f7aeae * Fix setup 2015-01-06 03:25:08 +11:00
Matthew Honnibal
3306ae1488 * Inc version 2015-01-05 19:11:23 +11:00
Matthew Honnibal
87fe01612a * Remove dependency on numpy and ujson 2015-01-05 19:11:12 +11:00
Matthew Honnibal
dd5a6be171 * Compile spacy.orth 2015-01-05 17:55:15 +11:00
Matthew Honnibal
1dd663ea03 * Inc version 2015-01-05 13:18:12 +11:00
Matthew Honnibal
f841a32ff7 * Inc version 2015-01-05 13:02:03 +11:00
Matthew Honnibal
0217df5779 * Increment version 2015-01-05 12:51:58 +11:00
Matthew Honnibal
170b93e89a * Inc version 2015-01-05 11:54:52 +11:00
Matthew Honnibal
454bec86dc * Increment version 2015-01-05 05:41:15 +11:00
Matthew Honnibal
83fa1850e2 * Refactor setup 2015-01-05 05:30:56 +11:00
Matthew Honnibal
e0c85371d1 * Increment version 2015-01-04 21:21:03 +11:00
Matthew Honnibal
0cd7652545 * Use headers_workaround to avoid having install dependencies, given setuptools bug 209. 2015-01-04 21:14:07 +11:00
Matthew Honnibal
27c737a80f * Specify murmurhash in the setup_requires field 2015-01-04 01:59:14 +11:00
Matthew Honnibal
a6f3c0c329 * Bump version number 2015-01-03 23:12:43 +11:00
Matthew Honnibal
a179f1fc52 * Fix setup.py 2015-01-03 21:02:10 +11:00
Matthew Honnibal
9b5cef8d4a * Move around data files 2015-01-03 01:59:43 +11:00
Matthew Honnibal
d48f90fbab * Write some more metadata in setup.py 2015-01-02 21:56:43 +11:00
Matthew Honnibal
a04e164a37 * Move tagger.pyx to _ml.pyx 2014-12-30 21:20:55 +11:00
Matthew Honnibal
ed0ff63c09 * Compile attrs and parser in setup 2014-12-23 15:18:20 +11:00
Matthew Honnibal
2a89d70429 * Add vocab.pyx to setup, and ensure we can import spacy.en.lang 2014-12-21 06:03:53 +11:00
Matthew Honnibal
87e9487d76 * Work on parser 2014-12-17 21:10:12 +11:00
Matthew Honnibal
7831b06610 * Compile morphology.pyx file 2014-12-10 08:09:13 +11:00
Matthew Honnibal
ef4398b204 * Rearrange POS stuff, so that language-specific stuff can live in language-specific modules 2014-12-07 23:52:41 +11:00
Matthew Honnibal
91e8d9ea1c * Compile context.pyx and tagger.pyx modules 2014-12-07 15:29:54 +11:00
Matthew Honnibal
a14f9eaf63 * Add index.pyx to setup 2014-12-04 22:14:11 +11:00
Matthew Honnibal
d0d812c548 * Hack setup.py to exclude tagger stuff 2014-12-03 11:06:57 +11:00
Matthew Honnibal
b934bf1c69 * Compile IOB 2014-11-12 23:21:40 +11:00
Matthew Honnibal
d5e9dce039 * Compile ner NER code 2014-11-11 21:10:22 +11:00
Matthew Honnibal
dbbb914480 * Upd setup 2014-11-05 20:45:44 +11:00
Matthew Honnibal
67c8c8019f * Update lexeme serialization, using a binary file format 2014-10-30 01:01:00 +11:00
Matthew Honnibal
5ebe14f353 * Add greedy pos tagger 2014-10-22 10:17:26 +11:00
Matthew Honnibal
aba4a7c7ea * Remove ptb3 file from setup 2014-09-25 18:41:25 +02:00
Matthew Honnibal
b15619e170 * Use PointerHash instead of locally provided _hashing module 2014-09-25 18:23:35 +02:00
Matthew Honnibal
ac522e2553 * Switch from own memory class to cymem, in pip 2014-09-17 23:09:24 +02:00
Matthew Honnibal
5a20dfc03e * Add memory management code 2014-09-17 20:02:06 +02:00
Matthew Honnibal
0447279c57 * PointerHash working, efficiency is good. 6-7 mins 2014-09-13 16:43:59 +02:00
Matthew Honnibal
b488224c09 * Restoring Lexeme-as-struct 2014-09-10 20:41:37 +02:00
Matthew Honnibal
e80d3b9784 * Compile tokens in setup 2014-09-10 19:41:19 +02:00
Matthew Honnibal
7dac9b9ccb * Fix setup script 2014-09-01 23:41:59 +02:00
Matthew Honnibal
68bae2fec6 * More refactoring 2014-08-25 16:42:22 +02:00
Matthew Honnibal
3b793cf4f7 * Tests passing for new Word object version 2014-08-24 18:13:53 +02:00
Matthew Honnibal
89d6faa9c9 * Move en_ptb to ptb3 2014-08-22 04:24:05 +02:00
Matthew Honnibal
d42cdbb446 * Compile orthography.latin.pyx 2014-08-20 17:03:19 +02:00
Matthew Honnibal
01469b0888 * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. 2014-08-18 19:14:00 +02:00
Matthew Honnibal
865cacfaf7 * Remove dependence on murmurhash 2014-08-16 17:37:09 +02:00
Matthew Honnibal
7fd9b2f1f8 * Add murmurhash to setup while we figure out cython includes 2014-08-15 23:56:57 +02:00
Matthew Honnibal
365a2af756 * Restore happax. commit uncommited work 2014-08-02 21:27:03 +01:00
Matthew Honnibal
18fb76b2c4 * Removed happax. Not sure if good idea. 2014-08-02 20:53:35 +01:00
Matthew Honnibal
d4b8bc07ce * Use FixedTable to control index size 2014-08-01 07:27:48 +01:00
Matthew Honnibal
a235804730 * Fix setup.py 2014-07-31 02:03:53 +01:00
Matthew Honnibal
5461399924 * Fix setup.py 2014-07-31 02:03:10 +01:00
Matthew Honnibal
b9016c4633 * Switch to using sparsehash and murmurhash libraries out of pip 2014-07-25 15:47:27 +01:00
Matthew Honnibal
1c5ab3b49a * Add tokens module to setup 2014-07-07 12:51:23 +02:00
Matthew Honnibal
648d1fe3ed * Compile en_ptb 2014-07-07 05:10:28 +02:00
Matthew Honnibal
0c1be7effe * Compile string_tools module 2014-07-07 04:24:00 +02:00
Matthew Honnibal
ca7045f3f2 * Add build/setup stuff 2014-07-05 20:49:34 +02:00