Commit Graph

10469 Commits

Author SHA1 Message Date
Matthew Honnibal
e885903dc6 * Add compile args to fix conda compilation on OSX, and increment version 2015-02-18 05:01:27 -05:00
Matthew Honnibal
69d27d55b0 * Inc version, with new orphan-token bug fix 2015-02-16 16:52:54 -05:00
Matthew Honnibal
cae077b583 * Work on fixing orphaned Token objects bug 2015-02-16 15:20:31 -05:00
Matthew Honnibal
789a6fe462 * Inc version --- 0.63 seems to have been packaged incorrectly, to not include a bug fix to tokens.pyx to transfer ownership to Token objects 2015-02-16 11:56:14 -05:00
Matthew Honnibal
9dbc31d72c * Add test from NSchrading 2015-02-16 11:49:31 -05:00
Matthew Honnibal
274b802830 * Fix docs bug 2015-02-11 20:07:39 -05:00
Matthew Honnibal
773d209405 * Inc version to 0.63 2015-02-11 18:39:41 -05:00
Matthew Honnibal
cd6367e404 * Fix cosine function in documentation 2015-02-11 18:08:19 -05:00
Matthew Honnibal
7572e31f5e * Pass ownership of C data to Token instances if Tokens object is being garbage-collected, but Token instances are staying alive. 2015-02-11 18:05:06 -05:00
Matthew Honnibal
db3f26a51b * Remove version note 2015-02-11 18:03:23 -05:00
Matthew Honnibal
4258b1490a * Improve API docs for Token 2015-02-11 18:03:06 -05:00
Matthew Honnibal
64645a1c2f * Improve docstring on English 2015-02-11 15:13:20 -05:00
Matthew Honnibal
f0a9d2cb9c * Inc version 2015-02-11 14:20:57 -05:00
Matthew Honnibal
594e50bd45 * Add option to download speech-parsing data set. 2015-02-11 14:20:29 -05:00
Matthew Honnibal
0b7e769211 * Add POS tags to support SWBD tag set 2015-02-11 14:08:28 -05:00
Matthew Honnibal
e425de6d2b Merge branch 'develop' of ssh://github.com/honnibal/spaCy into develop 2015-02-10 10:16:24 -05:00
Matthew Honnibal
5ff2b5c8f0 * Inc version 2015-02-10 10:16:09 -05:00
Matthew Honnibal
312b3a45f3 * Fix issue #19: Allow parsing/pos tagging of empty strings 2015-02-10 10:15:58 -05:00
leofidus
363473aeed Add rokenizer test for zero length string 2015-02-10 08:20:32 -05:00
honnibal
ae36067314 Merge pull request #21 from leofidus/test_notoken: Add rokenizer test for zero length string 2015-02-11 00:19:38 +11:00
Matthew Honnibal
2a0615104b * Upd download script 2015-02-09 10:22:59 -05:00
Matthew Honnibal
29bdf0d05a * Inc version 2015-02-09 10:22:06 -05:00
Matthew Honnibal
407bb5da8b * Increment version 2015-02-09 09:46:20 -05:00
Matthew Honnibal
ee33be31dd * Fix parser training script 2015-02-09 03:57:56 -05:00
Matthew Honnibal
5c3513583d * Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens. 2015-02-09 03:57:10 -05:00
Matthew Honnibal
be5536d239 * Fix Issue #22: PRP and PRP$ were mapped to NOUN. Should be PRON. 2015-02-08 18:36:18 -05:00
Matthew Honnibal
99f0a315f9 * Add test for Issue 24 2015-02-08 18:30:46 -05:00
Matthew Honnibal
0492cee8b4 * Fix Issue #24: Lemmas are empty when the L field is missing for special-cased tokens 2015-02-08 18:30:30 -05:00
Matthew Honnibal
3e8c87af1a * Extend parse tree navigation tests 2015-02-07 18:28:45 -05:00
Matthew Honnibal
933c188eb5 * Inc version 2015-02-07 13:14:27 -05:00
Matthew Honnibal
aadc57ab00 * Add tests for tokens api 2015-02-07 13:14:07 -05:00
Matthew Honnibal
d229fbd228 * Give better error on out-of-bounds array access 2015-02-07 12:59:12 -05:00
Matthew Honnibal
ab8bb047d0 * Fix negative index for __getitem__ 2015-02-07 12:58:46 -05:00
Matthew Honnibal
ef795aece8 * Upd release 2015-02-07 12:26:34 -05:00
Matthew Honnibal
44c7eafe44 * Fix download.py 2015-02-07 12:00:36 -05:00
Matthew Honnibal
b6c8624b82 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-02-07 11:53:13 -05:00
Matthew Honnibal
6ca7f2eedc * Upd download script 2015-02-07 11:32:33 -05:00
Matthew Honnibal
330b1a7a3d * Inc version 2015-02-07 11:32:13 -05:00
Matthew Honnibal
6b68607b1f * Add some tests for the code in the index.html docstrings 2015-02-07 08:52:13 -05:00
Matthew Honnibal
a7e4f0a86c * Make corrections to example code 2015-02-07 08:45:09 -05:00
Matthew Honnibal
f0e0588833 * Fill L2 norm attribute on LexemeC struct 2015-02-07 08:44:42 -05:00
Matthew Honnibal
75f9b7d6bf * Add L2 norm field to LexemeC struct 2015-02-07 08:43:17 -05:00
Matthew Honnibal
51b618d646 * Add a has_repvec property to Lexeme, and a check function to check flags 2015-02-07 08:42:44 -05:00
Matthew Honnibal
321b402739 * Store the l2 norm of the word's vector 2015-02-07 08:42:16 -05:00
leofidus
0ae05f77ab Add rokenizer test for zero length string 2015-02-07 03:01:44 +01:00
Matthew Honnibal
c7d8644149 * Fix regression on 'prob' attr of Token. 2015-02-03 03:32:18 +11:00
Matthew Honnibal
27986d7f5c * Fix standard conll file reading. Script needs refactoring. 2015-02-02 23:02:48 +11:00
Matthew Honnibal
c55a33d045 * Catch oracle errors 2015-02-02 23:02:04 +11:00
Matthew Honnibal
de772088e6 * Use parse tree for sbd in Tokens.sents 2015-02-02 12:17:32 +11:00
Matthew Honnibal
ba1e91189b * Fix 0.40 link in index 2015-02-02 12:16:53 +11:00