935 Commits

Author SHA1 Message Date
Matthew Honnibal
a61dacb4e5 * Add tests for new subtree method 2015-03-03 05:41:00 -05:00
Matthew Honnibal
053814ffc8 * Report LAS in train script 2015-03-03 04:35:11 -05:00
Matthew Honnibal
b07632a9ef * Upd docs, improving description of parse tree navigation 2015-03-03 04:34:33 -05:00
Matthew Honnibal
dbe26f5793 * Add children and subtree methods to Token, which are generators to assist parse-tree navigation. 2015-03-03 04:18:41 -05:00
Matthew Honnibal
827a2337b0 * Inc version 2015-02-27 03:56:54 -05:00
Matthew Honnibal
ea90d136e8 * Fix bug in labelled parsing, that caused an 8% drop in labelled accuracy. 2015-02-27 03:56:10 -05:00
Matthew Honnibal
5e27bd0c4c * Add en language data, for tokenizer etc 2015-02-25 17:10:32 -05:00
Matthew Honnibal
1019939c7a * Whitespace 2015-02-24 23:03:02 -05:00
Matthew Honnibal
74015da94b * Inc version 2015-02-23 15:40:41 -05:00
Matthew Honnibal
caf046b220 * Hastily add method to apply tags from a list of strings, instead of predicting the tags. 2015-02-23 15:40:17 -05:00
Matthew Honnibal
6102360111 * Add -Wno-strict-prototypes, to suppress warning 2015-02-21 20:04:37 -05:00
Matthew Honnibal
47a4371fea * Upd tokenizer with i.e. tests 2015-02-18 06:37:04 -05:00
Matthew Honnibal
ba1d3ddd7f * Move -lc++ link arg to only be used if darwin is OS. Should actually check whether GCC is compiler 2015-02-18 06:10:43 -05:00
Matthew Honnibal
59b46e4c2f * Move libc++ argument back under check for darwin. This assumes that extensions on OSX will be built with clang, but OSX GCC builds are also possible. Need to detect compiler and disable this flag 2015-02-18 06:03:45 -05:00
Matthew Honnibal
aa475673ee * Tweak compile args for OSX 2015-02-18 05:41:11 -05:00
Matthew Honnibal
b4edd1d907 * Make new compile args conditional on darwin, as they're invalid on Linux 2015-02-18 05:09:50 -05:00
Matthew Honnibal
e885903dc6 * Add compile args to fix conda compilation on OSX, and increment version 2015-02-18 05:01:27 -05:00
Matthew Honnibal
69d27d55b0 * Inc version, with new orphan-token bug fix 2015-02-16 16:52:54 -05:00
Matthew Honnibal
cae077b583 * Work on fixing orphaned Token objects bug 2015-02-16 15:20:31 -05:00
Matthew Honnibal
789a6fe462 * Inc version --- 0.63 seems to have been packaged incorrectly, to not include a bug fix to tokens.pyx to transfer ownership to Token objects 2015-02-16 11:56:14 -05:00
Matthew Honnibal
9dbc31d72c * Add test from NSchrading 2015-02-16 11:49:31 -05:00
Matthew Honnibal
274b802830 * Fix docs bug 2015-02-11 20:07:39 -05:00
Matthew Honnibal
773d209405 * Inc version to 0.63 2015-02-11 18:39:41 -05:00
Matthew Honnibal
cd6367e404 * Fix cosine function in documentation 2015-02-11 18:08:19 -05:00
Matthew Honnibal
7572e31f5e * Pass ownership of C data to Token instances if Tokens object is being garbage-collected, but Token instances are staying alive. 2015-02-11 18:05:06 -05:00
Matthew Honnibal
db3f26a51b * Remove version note 2015-02-11 18:03:23 -05:00
Matthew Honnibal
4258b1490a * Improve API docs for Token 2015-02-11 18:03:06 -05:00
Matthew Honnibal
64645a1c2f * Improve docstring on English 2015-02-11 15:13:20 -05:00
Matthew Honnibal
f0a9d2cb9c * Inc version 2015-02-11 14:20:57 -05:00
Matthew Honnibal
594e50bd45 * Add option to download speech-parsing data set. 2015-02-11 14:20:29 -05:00
Matthew Honnibal
0b7e769211 * Add POS tags to support SWBD tag set 2015-02-11 14:08:28 -05:00
Matthew Honnibal
e425de6d2b Merge branch 'develop' of ssh://github.com/honnibal/spaCy into develop 2015-02-10 10:16:24 -05:00
Matthew Honnibal
5ff2b5c8f0 * Inc version 2015-02-10 10:16:09 -05:00
Matthew Honnibal
312b3a45f3 * Fix issue #19: Allow parsing/pos tagging of empty strings 2015-02-10 10:15:58 -05:00
leofidus
363473aeed Add rokenizer test for zero length string 2015-02-10 08:20:32 -05:00
honnibal
ae36067314 Merge pull request #21 from leofidus/test_notoken: Add rokenizer test for zero length string 2015-02-11 00:19:38 +11:00
Matthew Honnibal
2a0615104b * Upd download script 2015-02-09 10:22:59 -05:00
Matthew Honnibal
29bdf0d05a * Inc version 2015-02-09 10:22:06 -05:00
Matthew Honnibal
407bb5da8b * Increment version 2015-02-09 09:46:20 -05:00
Matthew Honnibal
ee33be31dd * Fix parser training script 2015-02-09 03:57:56 -05:00
Matthew Honnibal
5c3513583d * Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens. 2015-02-09 03:57:10 -05:00
Matthew Honnibal
be5536d239 * Fix Issue #22: PRP and PRP$ were mapped to NOUN. Should be PRON. 2015-02-08 18:36:18 -05:00
Matthew Honnibal
99f0a315f9 * Add test for Issue 24 2015-02-08 18:30:46 -05:00
Matthew Honnibal
0492cee8b4 * Fix Issue #24: Lemmas are empty when the L field is missing for special-cased tokens 2015-02-08 18:30:30 -05:00
Matthew Honnibal
3e8c87af1a * Extend parse tree navigation tests 2015-02-07 18:28:45 -05:00
Matthew Honnibal
933c188eb5 * Inc version 2015-02-07 13:14:27 -05:00
Matthew Honnibal
aadc57ab00 * Add tests for tokens api 2015-02-07 13:14:07 -05:00
Matthew Honnibal
d229fbd228 * Give better error on out-of-bounds array access 2015-02-07 12:59:12 -05:00
Matthew Honnibal
ab8bb047d0 * Fix negative index for __getitem__ 2015-02-07 12:58:46 -05:00
Matthew Honnibal
ef795aece8 * Upd release 2015-02-07 12:26:34 -05:00