Commit Graph

922 Commits

Author SHA1 Message Date
Matthew Honnibal
77d62d0179 * Large refactor of Token objects, making them much thinner. This is to support fast parse-tree navigation. 2015-01-31 13:42:58 +11:00
Matthew Honnibal
88170e6295 * Supply dep_strings as a tuple, for the changed API on Tokens 2015-01-31 13:42:09 +11:00
Matthew Honnibal
0981d68022 * Set a sent_end flag during parsing, for later use 2015-01-31 13:41:46 +11:00
Matthew Honnibal
251dbf24d7 * Fix uninitialised variable error 2015-01-30 20:46:34 +11:00
Matthew Honnibal
83a4df5a1a * Fix download script 2015-01-30 20:40:42 +11:00
Matthew Honnibal
6f9ebc2f34 * Fix download script 2015-01-30 20:33:19 +11:00
Matthew Honnibal
a2bed49ac7 * Upd travis.yml 2015-01-30 20:27:35 +11:00
Matthew Honnibal
8b85d0bb8a * Only download small data if no data dir exists 2015-01-30 20:27:14 +11:00
Matthew Honnibal
e03b1fea22 * Don't download full data during test 2015-01-30 20:12:33 +11:00
Matthew Honnibal
2da694f65e * Don't load parser in test_pre_punct 2015-01-30 20:11:47 +11:00
Matthew Honnibal
e88ceda0ab * Set PYTHONPATH in travis.yml 2015-01-30 19:48:34 +11:00
Matthew Honnibal
6c081dd1fc * Handle failure when numpy headers are already installed correctly 2015-01-30 19:48:19 +11:00
Matthew Honnibal
1a7a1c2771 * Fix Issue #16: tokens recurse when printing 2015-01-30 19:47:50 +11:00
Matthew Honnibal
cb95ef6934 * Fix download script 2015-01-30 19:28:43 +11:00
Matthew Honnibal
e578bd37bd * Fix download script 2015-01-30 18:59:31 +11:00
Matthew Honnibal
df52014d12 * Fix download script 2015-01-30 18:36:24 +11:00
Matthew Honnibal
f0bbffca8d * Fix the way numpy headers are installed during compilation from source 2015-01-30 18:14:45 +11:00
Matthew Honnibal
1ef0e04aa0 * Change travis.yml to build from source, instead of from PyPI. PyPI checking will be done in a different branch. 2015-01-30 18:06:02 +11:00
Matthew Honnibal
0f95712189 * Improve accuracy reporting during training 2015-01-30 18:05:06 +11:00
Matthew Honnibal
b68f563c2f * Fix Issue #14: Improve parsing API 2015-01-30 18:04:41 +11:00
Matthew Honnibal
998b607f65 * Upd download script, having it download all data if there's no data/ directory, allowing easier compilation from source 2015-01-30 18:04:01 +11:00
Matthew Honnibal
0b53fd7daa * Add test for parse tree navigation 2015-01-30 18:02:58 +11:00
Matthew Honnibal
ef2493a3bd * Upd gitignore 2015-01-30 16:49:44 +11:00
Matthew Honnibal
d5d1578e44 * Add manifest file 2015-01-30 16:49:02 +11:00
Matthew Honnibal
0a1ec40f76 * Add draft work on features 2015-01-30 16:46:52 +11:00
Matthew Honnibal
7d432b7e39 * Add tests for vector-space model 2015-01-30 16:45:45 +11:00
Matthew Honnibal
48b98e3fb5 * Add test for tag names 2015-01-30 16:45:11 +11:00
Matthew Honnibal
613a195f92 * Add test for indices 2015-01-30 16:44:29 +11:00
Matthew Honnibal
03cc2ee08e * Add test for numpy array transport 2015-01-30 16:43:55 +11:00
Matthew Honnibal
d20eeac156 * Start work on lexrank tutorial 2015-01-30 16:42:43 +11:00
Matthew Honnibal
b3f9b199cf Merge branch 'punctparse' 2015-01-30 16:38:56 +11:00
Matthew Honnibal
ca7577d8a9 * Allow parsers and taggers to be trained on text without gold pre-processing. 2015-01-30 16:36:24 +11:00
Matthew Honnibal
67d6e53a69 * Ensure parser and tagger function correctly when training from missing values, indicated by -1 2015-01-30 14:08:56 +11:00
Matthew Honnibal
4ff180db74 * Fix off-by-one error in commit 0a7fceb 2015-01-30 12:49:33 +11:00
Matthew Honnibal
d0e08a5b57 * Upd index tests 2015-01-30 12:35:13 +11:00
Matthew Honnibal
0a7fcebdf7 * Fix Issue #12: Incorrect token.idx calculations for some punctuation, in the presence of token cache 2015-01-30 12:33:38 +11:00
Matthew Honnibal
b38093237e * More debug prints 2015-01-30 11:15:54 +11:00
Matthew Honnibal
35a18250cc * Upd tests, avoiding unnecessary processing to make testing faster 2015-01-30 10:41:55 +11:00
Matthew Honnibal
5458f220f8 * Fix quickstart instructions 2015-01-30 10:31:25 +11:00
Matthew Honnibal
11ed65b93c * Work on alignment, for evaluation with non-gold preprocessing 2015-01-30 10:31:03 +11:00
Matthew Honnibal
ebf7d2fab1 * Use non-joint sbd, for more simplicity and fewer classes 2015-01-29 06:22:03 +11:00
Matthew Honnibal
d05c5bf141 * Remove comment 2015-01-29 05:19:27 +11:00
Matthew Honnibal
b4348ce1c3 * Messily use unsegmented sentences to train the parser 2015-01-29 04:21:13 +11:00
Matthew Honnibal
320b045daa * Oracle now consistent over gold standard derivation 2015-01-29 03:41:58 +11:00
Matthew Honnibal
f590382134 * Work on sbd 2015-01-29 03:18:29 +11:00
Matthew Honnibal
9e78d673d5 * Fix quickstart installation docs 2015-01-28 14:28:34 +11:00
Matthew Honnibal
fe5f34c37c Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-01-28 14:01:00 +11:00
Matthew Honnibal
781dd712dc * Fix numpy commit problem 2015-01-28 14:00:20 +11:00
Matthew Honnibal
b08c0ce54e * Fix numpy install problem 2015-01-28 13:58:33 +11:00
Matthew Honnibal
9171284d62 * Fix compile-from-source instructions 2015-01-28 12:27:44 +11:00