Commit Graph

15937 Commits

Author | SHA1 | Message | Date
Matthew Honnibal | 77385d5580 | * Make .pxd file for huffman codec | 2015-07-13 13:54:51 +02:00
Matthew Honnibal | 0628e0e2a8 | * Add tests for huffman encoding | 2015-07-13 12:58:07 +02:00
Matthew Honnibal | 083b6ea7ae | * Clean up encoder a bit. now read for integration into Vocab. | 2015-07-13 12:57:22 +02:00
Matthew Honnibal | 8d0f1d98da | * Draft dockstring for HuffmanCache | 2015-07-13 12:01:18 +02:00
Matthew Honnibal | 281f1faefb | * Nearly finished huffman coder | 2015-07-12 23:48:46 +02:00
Matthew Honnibal | e1a25fba32 | * Work on huffman coder | 2015-07-12 19:58:05 +02:00
Matthew Honnibal | 3fb9de2d13 | * Remove vector[bint], in favor of simple Code struct. | 2015-07-12 17:58:27 +02:00
Matthew Honnibal | aa7bfd932b | * Work on compressor | 2015-07-12 16:03:43 +02:00
Matthew Honnibal | 14eafcab15 | * Refactor to use vector[bint] | 2015-07-12 05:27:47 +02:00
Matthew Honnibal | 6a6e852a39 | * Refactor huffman coding stuff into class | 2015-07-12 05:06:36 +02:00
Matthew Honnibal | aad96fdb5c | * Improve efficiency of huffman coding | 2015-07-12 01:31:37 +02:00
Matthew Honnibal | ff9ff6f3fa | * Ensure unseen words are given low log probability | 2015-07-12 01:31:09 +02:00
Matthew Honnibal | 9d3b0d83de | * Refactor huffman coding | 2015-07-11 22:27:43 +02:00
Matthew Honnibal | 8d29406cd6 | * Rename span.right to span.rights | 2015-07-11 22:15:04 +02:00
Matthew Honnibal | da9f358166 | * Fix span getting | 2015-07-11 21:41:41 +02:00
Matthew Honnibal | 11e8f2ffb4 | * Huffman codes working | 2015-07-11 20:01:10 +02:00
Matthew Honnibal | cb6fc81909 | * Work on huffman coding. | 2015-07-11 15:23:35 +02:00
Matthew Honnibal | 4c9b77fe95 | * Begin working on serialization code | 2015-07-11 10:57:30 +02:00
Matthew Honnibal | 11a380e00f | * Draft v0.89 update notes | 2015-07-10 19:41:42 +02:00
Matthew Honnibal | 53d1f5b2eb | * Rename Span.head to Span.root. | 2015-07-09 17:30:58 +02:00
Matthew Honnibal | c0255ed7d8 | * Allow slice indexing in Doc.__getitem__, returning a Span object | 2015-07-09 15:15:32 +02:00
Matthew Honnibal | 7d2964f673 | * Test that whitespace is not assigned a tag | 2015-07-09 13:31:40 +02:00
Matthew Honnibal | b5223c4824 | * Add whitespace to specials.json | 2015-07-09 13:31:12 +02:00
Matthew Honnibal | 89a91ad726 | * Add SPACE part-of-speech tag, and train tagger to assign it. Also train tagger not to make whitespace an entity | 2015-07-09 13:30:41 +02:00
Matthew Honnibal | f95da0bd52 | * Allow tests to read model dir from SPACY_DATA environment variable | 2015-07-09 12:18:02 +02:00
Matthew Honnibal | 55f1042443 | * Improve efficiency of L and R features, correcting the non-linear-in-length problem. | 2015-07-09 12:17:26 +02:00
Matthew Honnibal | 70d2acb579 | * Fix edge features | 2015-07-09 12:15:01 +02:00
Matthew Honnibal | 8a7bbd5850 | * Announce v0.88 | 2015-07-09 12:12:45 +02:00
Matthew Honnibal | 703ca40420 | * Inc version | 2015-07-08 20:07:23 +02:00
Matthew Honnibal | adb868bdad | * Add warning for models not found in parser | 2015-07-08 20:04:55 +02:00
Matthew Honnibal | 05b28ec9eb | * Add warning for models not found in parser | 2015-07-08 20:02:13 +02:00
Matthew Honnibal | ef700401a6 | * Add warning for models not found in parser | 2015-07-08 20:00:46 +02:00
Matthew Honnibal | 6218d8b389 | * Add warning for models not found in parser | 2015-07-08 19:59:16 +02:00
Matthew Honnibal | f6a6c39ce8 | * Add warning for models not found in parser | 2015-07-08 19:52:30 +02:00
Matthew Honnibal | 78db7e32f7 | * Remove has_sense method from Lexeme declaration | 2015-07-08 19:41:20 +02:00
Matthew Honnibal | 6ddb2f5e45 | * Restore merge_mwe in English class | 2015-07-08 19:35:30 +02:00
Matthew Honnibal | 6859f6adac | * Restore merge_mwe in English class | 2015-07-08 19:34:55 +02:00
Matthew Honnibal | 3c270fc8ff | * Remove has_sense method from Lexeme | 2015-07-08 19:28:29 +02:00
Matthew Honnibal | b64c843861 | * Remove senses attr | 2015-07-08 19:26:24 +02:00
Matthew Honnibal | 1d3a592edf | * Remove the senses attr from LexemeC, to keep data compatibility | 2015-07-08 19:24:44 +02:00
Matthew Honnibal | 0ceb1f71c2 | * Update parse features | 2015-07-08 19:11:36 +02:00
Matthew Honnibal | 2e51b5027a | * Alias Doc to Tokens, for backwards compatibility | 2015-07-08 18:59:35 +02:00
Matthew Honnibal | 462301d9e6 | * Fix reference to Tokens in documentation | 2015-07-08 18:58:25 +02:00
Matthew Honnibal | e3c53f5ecd | * Fix mention of Tokens in docstring | 2015-07-08 18:56:27 +02:00
Matthew Honnibal | 783867c1ad | * Update quickstart.rst for Tokens --> Doc rename | 2015-07-08 18:54:08 +02:00
Matthew Honnibal | bb522496dd | * Rename Tokens to Doc | 2015-07-08 18:53:00 +02:00
Matthew Honnibal | d0fc7f5ba9 | * Relabel docs sections | 2015-07-08 18:23:49 +02:00
Matthew Honnibal | ec398ef1d0 | * Relabel docs sections | 2015-07-08 18:20:00 +02:00
Matthew Honnibal | 38f6e92ffb | * Add docs for vocab and string store | 2015-07-08 18:00:33 +02:00
Matthew Honnibal | 2f7110e852 | * Add using/ docs. | 2015-07-08 17:59:07 +02:00