Matthew Honnibal
|
8057a95f20
|
* NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only use current word, 2 words context. Need refactoring.
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
ae235e07b9
|
* Refactoring working for parser, but now need to rig up features for NER, and then debug oracle etc.
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
b3eda03c9c
|
* Tmp
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
220ce8bfed
|
* Prepare English class for NER
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
f5830dc1c1
|
* Remove _transitions.pyx
|
2015-03-26 16:44:44 +01:00 |
|
Matthew Honnibal
|
6865c2fb4d
|
* Fix assignment of dep strings in tokens.pyx
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
6b6bce9e7a
|
* Fix label loading for transition system
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
5278c7504b
|
* Hacks to conll.pyx. Should clean these up.
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
f321b2b2eb
|
* Remove TODO comment
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
fdabd93bfb
|
* Ensure high loss for invalid moves, and fix label reading for arc-eager
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
10ed738df2
|
* Tmp commit
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
4f83c9b3d5
|
* Make costs label-sensitive
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
179b7eb0a7
|
* Specify parser transition system in language
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
8c883cef58
|
* Refactored transition system code now compiling. Still need to hook up label oracle, and test
|
2015-03-26 16:44:43 +01:00 |
|
Matthew Honnibal
|
f0159ab4b6
|
* Add file to hold GoldParse class
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
8eadb984cb
|
* Refactor arc_eager to use new TransitionSystem base class. Need to fix oracle
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
b063001596
|
* Add base TransitionSystem class. Still need to rethink how non-monotonic labelling will work for best_valid
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
01bc4d6815
|
* Add set_parse method, to assign parse to tokens in a less hacky way.
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
dc986dbc0b
|
* Work on refactored parser, where TransitionSystem can be easily subclassed
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
1cc6329b18
|
* Add base class to do transitions
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
135756ac3d
|
* Tmp commit of NER refactoring
|
2015-03-26 16:44:42 +01:00 |
|
Matthew Honnibal
|
23c1f6fc04
|
* Merge changes from stash
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
0ff078876a
|
* Commit some work on ner.pyx done on the plane
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
d81b7be6a2
|
* Merge train.py
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
2e3dc3dfe2
|
* Merge changes in tokens.pyx
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
8cc3524dc9
|
* Ws
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
3d0570685c
|
* Add NER transition system
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
043b758cf4
|
* Resurrect old NER code. This version won't be the one that runs; we want to re-use the parser code. But for now this is a useful reference.
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
b139aa92ba
|
* Start setting out how NER will be implemented in the data model
|
2015-03-26 16:44:41 +01:00 |
|
Matthew Honnibal
|
0962ffc095
|
* Fix issue #37: missing check_flag attribute from Token class
|
2015-03-26 15:06:26 +01:00 |
|
Matthew Honnibal
|
2e8d0e5d45
|
* Upd download script
|
2015-03-03 05:47:16 -05:00 |
|
Matthew Honnibal
|
dbe26f5793
|
* Add children and subtree methods to Token, which are generators to assist parse-tree navigation.
|
2015-03-03 04:18:41 -05:00 |
|
Matthew Honnibal
|
ea90d136e8
|
* Fix bug in labelled parsing that caused an 8% drop in labelled accuracy.
|
2015-02-27 03:56:10 -05:00 |
|
Matthew Honnibal
|
caf046b220
|
* Hastily add method to apply tags from a list of strings, instead of predicting the tags.
|
2015-02-23 15:40:17 -05:00 |
|
Matthew Honnibal
|
cae077b583
|
* Work on fixing orphaned Token objects bug
|
2015-02-16 15:20:31 -05:00 |
|
Matthew Honnibal
|
7572e31f5e
|
* Pass ownership of C data to Token instances if Tokens object is being garbage-collected, but Token instances are staying alive.
|
2015-02-11 18:05:06 -05:00 |
|
Matthew Honnibal
|
64645a1c2f
|
* Improve docstring on English
|
2015-02-11 15:13:20 -05:00 |
|
Matthew Honnibal
|
594e50bd45
|
* Add option to download speech-parsing data set.
|
2015-02-11 14:20:29 -05:00 |
|
Matthew Honnibal
|
0b7e769211
|
* Add POS tags to support SWBD tag set
|
2015-02-11 14:08:28 -05:00 |
|
Matthew Honnibal
|
312b3a45f3
|
* Fix issue #19: Allow parsing/pos tagging of empty strings
|
2015-02-10 10:15:58 -05:00 |
|
Matthew Honnibal
|
2a0615104b
|
* Upd download script
|
2015-02-09 10:22:59 -05:00 |
|
Matthew Honnibal
|
5c3513583d
|
* Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens.
|
2015-02-09 03:57:10 -05:00 |
|
Matthew Honnibal
|
be5536d239
|
* Fix Issue #22: PRP and PRP$ were mapped to NOUN. Should be PRON.
|
2015-02-08 18:36:18 -05:00 |
|
Matthew Honnibal
|
0492cee8b4
|
* Fix Issue #24: Lemmas are empty when the L field is missing for special-cased tokens
|
2015-02-08 18:30:30 -05:00 |
|
Matthew Honnibal
|
d229fbd228
|
* Give better error on out-of-bounds array access
|
2015-02-07 12:59:12 -05:00 |
|
Matthew Honnibal
|
ab8bb047d0
|
* Fix negative index for __getitem__
|
2015-02-07 12:58:46 -05:00 |
|
Matthew Honnibal
|
44c7eafe44
|
* Fix download.py
|
2015-02-07 12:00:36 -05:00 |
|
Matthew Honnibal
|
6ca7f2eedc
|
* Upd download script
|
2015-02-07 11:32:33 -05:00 |
|
Matthew Honnibal
|
f0e0588833
|
* Fill L2 norm attribute on LexemeC struct
|
2015-02-07 08:44:42 -05:00 |
|
Matthew Honnibal
|
75f9b7d6bf
|
* Add L2 norm field to LexemeC struct
|
2015-02-07 08:43:17 -05:00 |
|