1071 Commits

Author SHA1 Message Date
Matthew Honnibal
f5f15a1ef2 * Tmp commit 2015-03-26 16:44:43 +01:00
Matthew Honnibal
10ed738df2 * Tmp commit 2015-03-26 16:44:43 +01:00
Matthew Honnibal
4f83c9b3d5 * Make costs label-sensitive 2015-03-26 16:44:43 +01:00
Matthew Honnibal
179b7eb0a7 * Specify parser transition system in language 2015-03-26 16:44:43 +01:00
Matthew Honnibal
8c883cef58 * Refactored transition system code now compiling. Still need to hook up label oracle, and test 2015-03-26 16:44:43 +01:00
Matthew Honnibal
6e86790a4e * Add new syntax modules to setup.py 2015-03-26 16:44:42 +01:00
Matthew Honnibal
34215de61b * Upd train script, moving lots of functionality to new GoldParse class 2015-03-26 16:44:42 +01:00
Matthew Honnibal
f0159ab4b6 * Add file to hold GoldParse class 2015-03-26 16:44:42 +01:00
Matthew Honnibal
8eadb984cb * Refactor arc_eager to use new TransitionSystem base class. Need to fix oracle 2015-03-26 16:44:42 +01:00
Matthew Honnibal
b063001596 * Add base TransitionSystem class. Still need to rethink how non-monotonic labelling will work for best_valid 2015-03-26 16:44:42 +01:00
Matthew Honnibal
01bc4d6815 * Add set_parse method, to assign parse to tokens in a less hacky way. 2015-03-26 16:44:42 +01:00
Matthew Honnibal
dc986dbc0b * Work on refactored parser, where TransitionSystem can be easily subclassed 2015-03-26 16:44:42 +01:00
Matthew Honnibal
1cc6329b18 * Add base class to do transitions 2015-03-26 16:44:42 +01:00
Matthew Honnibal
135756ac3d * Tmp commit of NER refactoring 2015-03-26 16:44:42 +01:00
Matthew Honnibal
49df1b7002 * Ignore .tgz files 2015-03-26 16:44:42 +01:00
Matthew Honnibal
8715101239 * Merge changes from stash 2015-03-26 16:44:42 +01:00
Matthew Honnibal
23c1f6fc04 * Merge changes from stash 2015-03-26 16:44:41 +01:00
Matthew Honnibal
0ff078876a * Commit some work on ner.yx done on the plane 2015-03-26 16:44:41 +01:00
Matthew Honnibal
d81b7be6a2 * Merge train.py 2015-03-26 16:44:41 +01:00
Matthew Honnibal
3a302ae6f2 * Merge train.py 2015-03-26 16:44:41 +01:00
Matthew Honnibal
2e3dc3dfe2 * Merge changes in tokens.pyx 2015-03-26 16:44:41 +01:00
Matthew Honnibal
8cc3524dc9 * Ws 2015-03-26 16:44:41 +01:00
Matthew Honnibal
3d0570685c * Add NER transition system 2015-03-26 16:44:41 +01:00
Matthew Honnibal
043b758cf4 * Resurrect old NER code. This version won't be the one that runs; we want to re-use the parser code. But for now this is a useful reference. 2015-03-26 16:44:41 +01:00
Matthew Honnibal
b139aa92ba * Start setting out how NER will be implemented in the data model 2015-03-26 16:44:41 +01:00
Matthew Honnibal
0962ffc095 * Fix issue #37: missing check_flag attribute from Token class 2015-03-26 15:06:26 +01:00
Matthew Honnibal
5032f2a5c7 * Fix nested lists 2015-03-25 14:38:59 +01:00
Matthew Honnibal
03636be9da * Fix table md 2015-03-25 14:36:12 +01:00
Matthew Honnibal
2a39e87891 * Fix table md 2015-03-25 14:35:42 +01:00
Matthew Honnibal
9937f73075 * Fix table md 2015-03-25 14:34:36 +01:00
Matthew Honnibal
22368706ce * Add CLA stuff 2015-03-25 14:32:50 +01:00
Matthew Honnibal
46e936adfa * Fix quickstart 2015-03-19 00:09:39 -04:00
Matthew Honnibal
d345f53dbc * Add bootstrap script to install instructions 2015-03-16 14:14:00 -04:00
Matthew Honnibal
b924a4d642 * Add bootstrap script 2015-03-16 14:01:36 -04:00
Matthew Honnibal
2e8d0e5d45 * Upd download script 2015-03-03 05:47:16 -05:00
Matthew Honnibal
c341bfb0a2 * Inc version 2015-03-03 05:46:14 -05:00
Matthew Honnibal
a61dacb4e5 * Add tests for new subtree method 2015-03-03 05:41:00 -05:00
Matthew Honnibal
053814ffc8 * Report LAS in train script 2015-03-03 04:35:11 -05:00
Matthew Honnibal
b07632a9ef * Upd docs, improving description of parse tree navigation 2015-03-03 04:34:33 -05:00
Matthew Honnibal
dbe26f5793 * Add children and subtree methods to Token, which are generators to assist parse-tree navigation. 2015-03-03 04:18:41 -05:00
Matthew Honnibal
827a2337b0 * Inc version 2015-02-27 03:56:54 -05:00
Matthew Honnibal
ea90d136e8 * Fix bug in labelled parsing, that caused an 8% drop in labelled accuracy. 2015-02-27 03:56:10 -05:00
Matthew Honnibal
5e27bd0c4c * Add en language data, for tokenizer etc 2015-02-25 17:10:32 -05:00
Matthew Honnibal
1019939c7a * Whitespace 2015-02-24 23:03:02 -05:00
Matthew Honnibal
74015da94b * Inc version 2015-02-23 15:40:41 -05:00
Matthew Honnibal
caf046b220 * Hastily add method to apply tags from a list of strings, instead of predicting the tags. 2015-02-23 15:40:17 -05:00
Matthew Honnibal
6102360111 * Add -Wno-strict-prototypes, to suppress warning 2015-02-21 20:04:37 -05:00
Matthew Honnibal
47a4371fea * Upd tokenizer with i.e. tests 2015-02-18 06:37:04 -05:00
Matthew Honnibal
ba1d3ddd7f * Move -lc++ link arg to only be used if darwin is OS. Should actually check whether GCC is compiler 2015-02-18 06:10:43 -05:00
Matthew Honnibal
59b46e4c2f * Move libc++ argument back under check for darwin. This assumes that extensions on OSX will be built with clang, but OSX GCC builds are also possible. Need to detect compiler and disable this flag 2015-02-18 06:03:45 -05:00