974 Commits

Author SHA1 Message Date
Matthew Honnibal
47a4371fea * Upd tokenizer with i.e. tests 2015-02-18 06:37:04 -05:00
Matthew Honnibal
ba1d3ddd7f * Move -lc++ link arg to only be used if darwin is OS. Should actually check whether GCC is compiler 2015-02-18 06:10:43 -05:00
Matthew Honnibal
59b46e4c2f * Move libc++ argument back under check for darwin. This assumes that extensions on OSX will be built with clang, but OSX GCC builds are also possible. Need to detect compiler and disable this flag 2015-02-18 06:03:45 -05:00
Matthew Honnibal
aa475673ee * Tweak compile args for OSX 2015-02-18 05:41:11 -05:00
Matthew Honnibal
b4edd1d907 * Make new compile args conditional on darwin, as they're invalid on Linux 2015-02-18 05:09:50 -05:00
Matthew Honnibal
e885903dc6 * Add compile args to fix conda compilation on OSX, and increment version 2015-02-18 05:01:27 -05:00
Matthew Honnibal
69d27d55b0 * Inc version, with new orphan-token bug fix 2015-02-16 16:52:54 -05:00
Matthew Honnibal
cae077b583 * Work on fixing orphaned Token objects bug 2015-02-16 15:20:31 -05:00
Matthew Honnibal
789a6fe462 * Inc version --- 0.63 seems to have been packaged incorrectly, to not include a bug fix to tokens.pyx to transfer ownership to Token objects 2015-02-16 11:56:14 -05:00
Matthew Honnibal
9dbc31d72c * Add test from NSchrading 2015-02-16 11:49:31 -05:00
Matthew Honnibal
274b802830 * Fix docs bug 2015-02-11 20:07:39 -05:00
Matthew Honnibal
773d209405 * Inc version to 0.63 2015-02-11 18:39:41 -05:00
Matthew Honnibal
cd6367e404 * Fix cosine function in documentation 2015-02-11 18:08:19 -05:00
Matthew Honnibal
7572e31f5e * Pass ownership of C data to Token instances if Tokens object is being garbage-collected, but Token instances are staying alive. 2015-02-11 18:05:06 -05:00
Matthew Honnibal
db3f26a51b * Remove version note 2015-02-11 18:03:23 -05:00
Matthew Honnibal
4258b1490a * Improve API docs for Token 2015-02-11 18:03:06 -05:00
Matthew Honnibal
64645a1c2f * Improve docstring on English 2015-02-11 15:13:20 -05:00
Matthew Honnibal
f0a9d2cb9c * Inc version 2015-02-11 14:20:57 -05:00
Matthew Honnibal
594e50bd45 * Add option to download speech-parsing data set. 2015-02-11 14:20:29 -05:00
Matthew Honnibal
0b7e769211 * Add POS tags to support SWBD tag set 2015-02-11 14:08:28 -05:00
Matthew Honnibal
e425de6d2b Merge branch 'develop' of ssh://github.com/honnibal/spaCy into develop 2015-02-10 10:16:24 -05:00
Matthew Honnibal
5ff2b5c8f0 * Inc version 2015-02-10 10:16:09 -05:00
Matthew Honnibal
312b3a45f3 * Fix issue #19: Allow parsing/pos tagging of empty strings 2015-02-10 10:15:58 -05:00
leofidus
363473aeed Add rokenizer test for zero length string 2015-02-10 08:20:32 -05:00
honnibal
ae36067314 Merge pull request #21 from leofidus/test_notoken: Add rokenizer test for zero length string 2015-02-11 00:19:38 +11:00
Matthew Honnibal
2a0615104b * Upd download script 2015-02-09 10:22:59 -05:00
Matthew Honnibal
29bdf0d05a * Inc version 2015-02-09 10:22:06 -05:00
Matthew Honnibal
407bb5da8b * Increment version 2015-02-09 09:46:20 -05:00
Matthew Honnibal
ee33be31dd * Fix parser training script 2015-02-09 03:57:56 -05:00
Matthew Honnibal
5c3513583d * Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens. 2015-02-09 03:57:10 -05:00
Matthew Honnibal
be5536d239 * Fix Issue #22: PRP and PRP$ were mapped to NOUN. Should be PRON. 2015-02-08 18:36:18 -05:00
Matthew Honnibal
99f0a315f9 * Add test for Issue 24 2015-02-08 18:30:46 -05:00
Matthew Honnibal
0492cee8b4 * Fix Issue #24: Lemmas are empty when the L field is missing for special-cased tokens 2015-02-08 18:30:30 -05:00
Matthew Honnibal
3e8c87af1a * Extend parse tree navigation tests 2015-02-07 18:28:45 -05:00
Matthew Honnibal
933c188eb5 * Inc version 2015-02-07 13:14:27 -05:00
Matthew Honnibal
aadc57ab00 * Add tests for tokens api 2015-02-07 13:14:07 -05:00
Matthew Honnibal
d229fbd228 * Give better error on out-of-bounds array access 2015-02-07 12:59:12 -05:00
Matthew Honnibal
ab8bb047d0 * Fix negative index for __getitem__ 2015-02-07 12:58:46 -05:00
Matthew Honnibal
ef795aece8 * Upd release 2015-02-07 12:26:34 -05:00
Matthew Honnibal
44c7eafe44 * Fix download.py 2015-02-07 12:00:36 -05:00
Matthew Honnibal
b6c8624b82 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-02-07 11:53:13 -05:00
Matthew Honnibal
6ca7f2eedc * Upd download script 2015-02-07 11:32:33 -05:00
Matthew Honnibal
330b1a7a3d * Inc version 2015-02-07 11:32:13 -05:00
Matthew Honnibal
6b68607b1f * Add some tests for the code in the index.html docstrings 2015-02-07 08:52:13 -05:00
Matthew Honnibal
a7e4f0a86c * Make corrections to example code 2015-02-07 08:45:09 -05:00
Matthew Honnibal
f0e0588833 * Fill L2 norm attribute on LexemeC struct 2015-02-07 08:44:42 -05:00
Matthew Honnibal
75f9b7d6bf * Add L2 norm field to LexemeC struct 2015-02-07 08:43:17 -05:00
Matthew Honnibal
51b618d646 * Add a has_repvec property to Lexeme, and a check function to check flags 2015-02-07 08:42:44 -05:00
Matthew Honnibal
321b402739 * Store the l2 norm of the word's vector 2015-02-07 08:42:16 -05:00
leofidus
0ae05f77ab Add rokenizer test for zero length string 2015-02-07 03:01:44 +01:00