Commit Graph

  • bcd038e7b6 * Implement HastyModel Matthew Honnibal 2014-12-31 01:16:47 +1100
  • e361f18ce9 * Fix docs command in fabfile Matthew Honnibal 2014-12-30 23:27:30 +1100
  • 1a075f77ff * Don't over-ride pre-loaded POS tags, if set by special-cases Matthew Honnibal 2014-12-30 23:26:32 +1100
  • 785c7ba76a * Embed signature on attrs Matthew Honnibal 2014-12-30 23:25:31 +1100
  • 30e5805656 * Lazy-load tagger and parser Matthew Honnibal 2014-12-30 23:25:09 +1100
  • 9976aa976e * Messily fix morphology and POS tags on special tokens. Matthew Honnibal 2014-12-30 23:24:37 +1100
  • 81d878beb2 * Upd tests Matthew Honnibal 2014-12-30 21:34:09 +1100
  • c1ef3febee * Embedsignature in tokens.pyx Matthew Honnibal 2014-12-30 21:22:00 +1100
  • aac5028b6e * Move tagger to _ml Matthew Honnibal 2014-12-30 21:21:38 +1100
  • 1ffb0229ed * Import tokens in parser.pxd Matthew Honnibal 2014-12-30 21:21:17 +1100
  • a04e164a37 * Move tagger.pyx to _ml.pyx Matthew Honnibal 2014-12-30 21:20:55 +1100
  • cdc1a27104 * Update docs Matthew Honnibal 2014-12-30 21:20:34 +1100
  • bb0b00f819 * Repurpose the Tagger class as a generic Model, wrapping thinc's interface Matthew Honnibal 2014-12-30 21:20:15 +1100
  • fe2a5e0370 * Work on docstrings Matthew Honnibal 2014-12-27 21:46:04 +1100
  • 6352e3e2a2 * Work on API reference Matthew Honnibal 2014-12-27 18:45:47 +1100
  • bb80937544 * Upd docstrings Matthew Honnibal 2014-12-27 18:45:16 +1100
  • 91a5064b7f * Upd tests Matthew Honnibal 2014-12-26 14:26:27 +1100
  • b8b65903fc * Tmp Matthew Honnibal 2014-12-24 17:42:00 +1100
  • 75a6930ad9 * Fix results table Matthew Honnibal 2014-12-24 14:35:32 +1100
  • a68ecc50fa * Ignore cpp files within en dir Matthew Honnibal 2014-12-23 15:19:01 +1100
  • ab61673edd * Fix api of array method Matthew Honnibal 2014-12-23 15:18:48 +1100
  • ed0ff63c09 * Compile attrs and parser in setup Matthew Honnibal 2014-12-23 15:18:20 +1100
  • 9dda8b4500 * Play with examples in index.rst Matthew Honnibal 2014-12-23 15:17:56 +1100
  • 7708d0e24a * Move lemmatizer to en dir Matthew Honnibal 2014-12-23 15:16:57 +1100
  • 98eb4c0426 * Fix path to parser model Matthew Honnibal 2014-12-23 15:09:09 +1100
  • b00bc01d8c * All tests now passing for reorg Matthew Honnibal 2014-12-23 13:18:59 +1100
  • 73f200436f * Tests passing except for morphology/lemmatization stuff Matthew Honnibal 2014-12-23 11:40:32 +1100
  • cf8d26c3d2 * POS tagger training working after reorg Matthew Honnibal 2014-12-22 08:54:47 +1100
  • 4c4aa2c5c9 * Work on train Matthew Honnibal 2014-12-22 07:25:43 +1100
  • 4d4d2c0db4 * Upd test Matthew Honnibal 2014-12-21 21:05:28 +1100
  • d047dc0d0f Upd lemmatizer test Matthew Honnibal 2014-12-21 21:02:44 +1100
  • b864f0e539 * Upd iteration test Matthew Honnibal 2014-12-21 21:01:46 +1100
  • 61df50b598 * Add English-subclass POS tagger Matthew Honnibal 2014-12-21 20:59:07 +1100
  • c1ab134159 * Upd lemmas test Matthew Honnibal 2014-12-21 20:58:21 +1100
  • 82bd57c76f * Upd intern test Matthew Honnibal 2014-12-21 20:44:21 +1100
  • 734d1da55c * Upd emoticons test Matthew Honnibal 2014-12-21 20:43:27 +1100
  • 199025609f * Upd contractions test Matthew Honnibal 2014-12-21 20:41:13 +1100
  • 0d9972f4b0 * Upd tokenizer test Matthew Honnibal 2014-12-21 20:38:27 +1100
  • 69e3a07fa1 * More index.rst fiddling Matthew Honnibal 2014-12-21 17:40:12 +1100
  • 9f3f07cab6 * Add attrs file for English Matthew Honnibal 2014-12-21 11:29:11 +1100
  • 2a89d70429 * Add vocab.pyx to setup, and ensure we can import spacy.en.lang Matthew Honnibal 2014-12-21 06:03:53 +1100
  • b34a1325d3 * Everything compiling after reorg. About to start testing. Matthew Honnibal 2014-12-21 05:42:23 +1100
  • e1c1a4b868 * Tmp Matthew Honnibal 2014-12-21 05:36:29 +1100
  • d11c1edf8c * Import slice_unicode from strings.pyx Matthew Honnibal 2014-12-20 07:56:26 +1100
  • be1bdcbd85 * Move lang.pyx to tokenizer.pyx Matthew Honnibal 2014-12-20 07:54:49 +1100
  • 89a1cc1a48 * Move murmurhash to .pxd in strings file Matthew Honnibal 2014-12-20 07:41:08 +1100
  • d5a942c4a4 * Rename lang.pyx to tokenizer.pyx Matthew Honnibal 2014-12-20 07:30:39 +1100
  • a60ae261ae * Move tokenizer to its own file, and refactor Matthew Honnibal 2014-12-20 07:29:16 +1100
  • 867a4a000c * Export set_morph_from_dict function Matthew Honnibal 2014-12-20 07:28:27 +1100
  • 4e30195c6d * Refactor morphology.pyx Matthew Honnibal 2014-12-20 07:27:28 +1100
  • 4c6ce7ee84 * Update tokens.pyx as part of reorg Matthew Honnibal 2014-12-20 07:03:26 +1100
  • 116f7f3bc1 * Rename Lexicon to Vocab, and move it to its own file Matthew Honnibal 2014-12-20 06:54:03 +1100
  • 780cbd68b1 * Move all struct definitions to structs.pxd, to avoid circular dependencies Matthew Honnibal 2014-12-20 06:51:33 +1100
  • f6556d8e5d * Refactor, move Lexeme struct to structs.pxd Matthew Honnibal 2014-12-20 06:51:03 +1100
  • 7d48bba6c4 * Move StringStore class to its own file Matthew Honnibal 2014-12-20 06:42:01 +1100
  • e15b9da7db * Pin preshed to a particular version Matthew Honnibal 2014-12-20 04:01:32 +1100
  • ed2fff6128 * Add tests Matthew Honnibal 2014-12-20 03:51:25 +1100
  • b066102d2d * Remove POS cache for now Matthew Honnibal 2014-12-20 03:49:32 +1100
  • ff252dd535 * Clean up 'guess_cache' idea, which didn't work well enough Matthew Honnibal 2014-12-20 03:48:51 +1100
  • 9d3ca13909 * Start work on parse-tree iteration classes Matthew Honnibal 2014-12-20 03:48:10 +1100
  • bed680c632 * Remove commented-out features Matthew Honnibal 2014-12-20 03:47:32 +1100
  • 3d178c03ae * Prune the features a bit Matthew Honnibal 2014-12-20 02:46:14 +1100
  • a0408e1758 * Working DecisionMemory class Matthew Honnibal 2014-12-20 01:43:26 +1100
  • 7920ea72b4 * Working parser with the decision memory idea. Disabling that for now, for simplicity Matthew Honnibal 2014-12-20 01:43:15 +1100
  • a2f2a48da9 * Add some extra features Matthew Honnibal 2014-12-20 01:42:24 +1100
  • 8fd9762d91 * Start laying out parse tree iteration methods Matthew Honnibal 2014-12-20 01:42:09 +1100
  • 53b8bc1f3c * Work on implementing a trainable cache for the parser. So far, doesn't improve efficiency Matthew Honnibal 2014-12-19 09:30:50 +1100
  • 033d6c9ac2 * Adapt POS tagger decision-memory for use in parser Matthew Honnibal 2014-12-19 07:23:04 +1100
  • 809ddf7887 * Add index.pxd Matthew Honnibal 2014-12-19 07:23:00 +1100
  • 1879abd16a * Set const-correctness for tagger Matthew Honnibal 2014-12-18 20:41:52 +1100
  • f72243b156 * Set const-correctness for Feature* array Matthew Honnibal 2014-12-18 20:41:32 +1100
  • 6ab7e40590 * Add non-monotonic parsing with cost-sensitive update. 92.26 on Y&M set Matthew Honnibal 2014-12-18 11:33:25 +1100
  • 7e0c692daf * Automatically push when the stack is empty Matthew Honnibal 2014-12-18 09:16:10 +1100
  • 61142a8eff * Tweak features Matthew Honnibal 2014-12-18 09:15:03 +1100
  • e3b123e6e0 * Ignore cpp files from parser Matthew Honnibal 2014-12-18 09:05:51 +1100
  • 8446ebfbbb * Work on parser. Up to 92 UAS on YM labels Matthew Honnibal 2014-12-18 09:05:31 +1100
  • 55de747bfc * Remove .cpp files Matthew Honnibal 2014-12-18 02:43:13 +1100
  • 4448a840f7 * Work on greedy parsing. Scoring about 91.2 Matthew Honnibal 2014-12-18 02:42:55 +1100
  • 87e9487d76 * Work on parser Matthew Honnibal 2014-12-17 21:10:12 +1100
  • 9d7d97978d * Work on greedy parser Matthew Honnibal 2014-12-17 21:09:29 +1100
  • d524dd306a * Work on greedy parser Matthew Honnibal 2014-12-17 03:19:43 +1100
  • 95ccea03b2 * Work on greedy parser Matthew Honnibal 2014-12-16 22:44:43 +1100
  • a432862fde * Add exception type to _arg_max_among in tagger Matthew Honnibal 2014-12-16 09:44:19 +1100
  • 9e00798820 * Work on integrating a greedy dependency parser Matthew Honnibal 2014-12-16 08:06:04 +1100
  • 24ffc32f2f * Another redraft of index.rst Matthew Honnibal 2014-12-15 16:32:03 +1100
  • 77dd7a212a * More thoughts on intro Matthew Honnibal 2014-12-15 09:19:29 +1100
  • 792802b2b9 * POS tag memoisation working, with good speed-up Matthew Honnibal 2014-12-12 14:33:51 +1100
  • ca54d58638 * Merge setup.py Matthew Honnibal 2014-12-10 15:21:27 +1100
  • 9959a64f7b * Working morphology and lemmatisation. POS tagging quite fast. Matthew Honnibal 2014-12-10 08:09:32 +1100
  • 7831b06610 * Compile morphology.pyx file Matthew Honnibal 2014-12-10 08:09:13 +1100
  • df3be14987 * Add pos_type features to POS tagger Matthew Honnibal 2014-12-10 08:08:55 +1100
  • 42973c4b37 * Improve efficiency of tagger, and improve morphological processing Matthew Honnibal 2014-12-10 01:02:04 +1100
  • 6b34a2f34b * Move morphological analysis into its own module, morphology.pyx Matthew Honnibal 2014-12-09 21:16:17 +1100
  • b962fe73d7 * Make suffixes file use full-power regex, so that we can handle periods properly Matthew Honnibal 2014-12-09 19:04:27 +1100
  • accdbe989b * Remove Tokens.extend method Matthew Honnibal 2014-12-09 17:09:23 +1100
  • 495e1c7366 * Use fused type in Tokens.push_back, simplifying the use of the cache Matthew Honnibal 2014-12-09 16:50:01 +1100
  • 516f0f1e14 * Remove test for loading ad hoc rules format Matthew Honnibal 2014-12-09 16:08:45 +1100
  • 6369835306 * Add false positive test for emoticons Matthew Honnibal 2014-12-09 16:08:17 +1100
  • f15deaad5b * Upd docs Matthew Honnibal 2014-12-09 16:08:01 +1100
  • 1ccabc806e * Work on lemmatization Matthew Honnibal 2014-12-09 16:06:18 +1100