Commit Graph

471 Commits

Author | SHA1 | Message | Date
Matthew Honnibal | 0447279c57 | * PointerHash working, efficiency is good. 6-7 mins | 2014-09-13 16:43:59 +02:00
Matthew Honnibal | b488224c09 | * Restoring Lexeme-as-struct | 2014-09-10 20:41:37 +02:00
Matthew Honnibal | e80d3b9784 | * Compile tokens in setup | 2014-09-10 19:41:19 +02:00
Matthew Honnibal | 7dac9b9ccb | * Fix setup script | 2014-09-01 23:41:59 +02:00
Matthew Honnibal | 68bae2fec6 | * More refactoring | 2014-08-25 16:42:22 +02:00
Matthew Honnibal | 3b793cf4f7 | * Tests passing for new Word object version | 2014-08-24 18:13:53 +02:00
Matthew Honnibal | 89d6faa9c9 | * Move en_ptb to ptb3 | 2014-08-22 04:24:05 +02:00
Matthew Honnibal | d42cdbb446 | * Compile orthography.latin.pyx | 2014-08-20 17:03:19 +02:00
Matthew Honnibal | 01469b0888 | * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. | 2014-08-18 19:14:00 +02:00
Matthew Honnibal | 865cacfaf7 | * Remove dependence on murmurhash | 2014-08-16 17:37:09 +02:00
Matthew Honnibal | 7fd9b2f1f8 | * Add murmurhash to setup while we figure out cython includes | 2014-08-15 23:56:57 +02:00
Matthew Honnibal | 365a2af756 | * Restore happax. commit uncommited work | 2014-08-02 21:27:03 +01:00
Matthew Honnibal | 18fb76b2c4 | * Removed happax. Not sure if good idea. | 2014-08-02 20:53:35 +01:00
Matthew Honnibal | d4b8bc07ce | * Use FixedTable to control index size | 2014-08-01 07:27:48 +01:00
Matthew Honnibal | a235804730 | * Fix setup.py | 2014-07-31 02:03:53 +01:00
Matthew Honnibal | 5461399924 | * Fix setup.py | 2014-07-31 02:03:10 +01:00
Matthew Honnibal | b9016c4633 | * Switch to using sparsehash and murmurhash libraries out of pip | 2014-07-25 15:47:27 +01:00
Matthew Honnibal | 1c5ab3b49a | * Add tokens module to setup | 2014-07-07 12:51:23 +02:00
Matthew Honnibal | 648d1fe3ed | * Compile en_ptb | 2014-07-07 05:10:28 +02:00
Matthew Honnibal | 0c1be7effe | * Compile string_tools module | 2014-07-07 04:24:00 +02:00
Matthew Honnibal | ca7045f3f2 | * Add build/setup stuff | 2014-07-05 20:49:34 +02:00