953 Commits

Author | SHA1 | Message | Date
Matthew Honnibal | d81b7be6a2 | * Merge train.py | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | 3a302ae6f2 | * Merge train.py | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | 2e3dc3dfe2 | * Merge changes in tokens.pyx | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | 8cc3524dc9 | * Ws | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | 3d0570685c | * Add NER transition system | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | 043b758cf4 | * Resurrect old NER code. This version won't be the one that runs; we want to re-use the parser code. But for now this is a useful reference. | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | b139aa92ba | * Start setting out how NER will be implemented in the data model | 2015-03-26 16:44:41 +01:00
Matthew Honnibal | 0962ffc095 | * Fix issue #37: missing check_flag attribute from Token class | 2015-03-26 15:06:26 +01:00
Matthew Honnibal | 5032f2a5c7 | * Fix nested lists | 2015-03-25 14:38:59 +01:00
Matthew Honnibal | 03636be9da | * Fix table md | 2015-03-25 14:36:12 +01:00
Matthew Honnibal | 2a39e87891 | * Fix table md | 2015-03-25 14:35:42 +01:00
Matthew Honnibal | 9937f73075 | * Fix table md | 2015-03-25 14:34:36 +01:00
Matthew Honnibal | 22368706ce | * Add CLA stuff | 2015-03-25 14:32:50 +01:00
Matthew Honnibal | 46e936adfa | * Fix quickstart | 2015-03-19 00:09:39 -04:00
Matthew Honnibal | d345f53dbc | * Add bootstrap script to install instructions | 2015-03-16 14:14:00 -04:00
Matthew Honnibal | b924a4d642 | * Add bootstrap script | 2015-03-16 14:01:36 -04:00
Matthew Honnibal | 2e8d0e5d45 | * Upd download script | 2015-03-03 05:47:16 -05:00
Matthew Honnibal | c341bfb0a2 | * Inc version | 2015-03-03 05:46:14 -05:00
Matthew Honnibal | a61dacb4e5 | * Add tests for new subtree method | 2015-03-03 05:41:00 -05:00
Matthew Honnibal | 053814ffc8 | * Report LAS in train script | 2015-03-03 04:35:11 -05:00
Matthew Honnibal | b07632a9ef | * Upd docs, improving description of parse tree navigation | 2015-03-03 04:34:33 -05:00
Matthew Honnibal | dbe26f5793 | * Add children and subtree methods to Token, which are generators to assist parse-tree navigation. | 2015-03-03 04:18:41 -05:00
Matthew Honnibal | 827a2337b0 | * Inc version | 2015-02-27 03:56:54 -05:00
Matthew Honnibal | ea90d136e8 | * Fix bug in labelled parsing, that caused an 8% drop in labelled accuracy. | 2015-02-27 03:56:10 -05:00
Matthew Honnibal | 5e27bd0c4c | * Add en language data, for tokenizer etc | 2015-02-25 17:10:32 -05:00
Matthew Honnibal | 1019939c7a | * Whitespace | 2015-02-24 23:03:02 -05:00
Matthew Honnibal | 74015da94b | * Inc version | 2015-02-23 15:40:41 -05:00
Matthew Honnibal | caf046b220 | * Hastily add method to apply tags from a list of strings, instead of predicting the tags. | 2015-02-23 15:40:17 -05:00
Matthew Honnibal | 6102360111 | * Add -Wno-strict-prototypes, to suppress warning | 2015-02-21 20:04:37 -05:00
Matthew Honnibal | 47a4371fea | * Upd tokenizer with i.e. tests | 2015-02-18 06:37:04 -05:00
Matthew Honnibal | ba1d3ddd7f | * Move -lc++ link arg to only be used if darwin is OS. Should actually check whether GCC is compiler | 2015-02-18 06:10:43 -05:00
Matthew Honnibal | 59b46e4c2f | * Move libc++ argument back under check for darwin. This assumes that extensions on OSX will be built with clang, but OSX GCC builds are also possible. Need to detect compiler and disable this flag | 2015-02-18 06:03:45 -05:00
Matthew Honnibal | aa475673ee | * Tweak compile args for OSX | 2015-02-18 05:41:11 -05:00
Matthew Honnibal | b4edd1d907 | * Make new compile args conditional on darwin, as they're invalid on Linux | 2015-02-18 05:09:50 -05:00
Matthew Honnibal | e885903dc6 | * Add compile args to fix conda compilation on OSX, and increment version | 2015-02-18 05:01:27 -05:00
Matthew Honnibal | 69d27d55b0 | * Inc version, with new orphan-token bug fix | 2015-02-16 16:52:54 -05:00
Matthew Honnibal | cae077b583 | * Work on fixing orphaned Token objects bug | 2015-02-16 15:20:31 -05:00
Matthew Honnibal | 789a6fe462 | * Inc version --- 0.63 seems to have been packaged incorrectly, to not include a bug fix to tokens.pyx to transfer ownership to Token objects | 2015-02-16 11:56:14 -05:00
Matthew Honnibal | 9dbc31d72c | * Add test from NSchrading | 2015-02-16 11:49:31 -05:00
Matthew Honnibal | 274b802830 | * Fix docs bug | 2015-02-11 20:07:39 -05:00
Matthew Honnibal | 773d209405 | * Inc version to 0.63 | 2015-02-11 18:39:41 -05:00
Matthew Honnibal | cd6367e404 | * Fix cosine function in documentation | 2015-02-11 18:08:19 -05:00
Matthew Honnibal | 7572e31f5e | * Pass ownership of C data to Token instances if Tokens object is being garbage-collected, but Token instances are staying alive. | 2015-02-11 18:05:06 -05:00
Matthew Honnibal | db3f26a51b | * Remove version note | 2015-02-11 18:03:23 -05:00
Matthew Honnibal | 4258b1490a | * Improve API docs for Token | 2015-02-11 18:03:06 -05:00
Matthew Honnibal | 64645a1c2f | * Improve docstring on English | 2015-02-11 15:13:20 -05:00
Matthew Honnibal | f0a9d2cb9c | * Inc version | 2015-02-11 14:20:57 -05:00
Matthew Honnibal | 594e50bd45 | * Add option to download speech-parsing data set. | 2015-02-11 14:20:29 -05:00
Matthew Honnibal | 0b7e769211 | * Add POS tags to support SWBD tag set | 2015-02-11 14:08:28 -05:00
Matthew Honnibal | e425de6d2b | Merge branch 'develop' of ssh://github.com/honnibal/spaCy into develop | 2015-02-10 10:16:24 -05:00