| Matthew Honnibal | 6f47a667cf | * Move Span class to own file | 2015-03-26 16:45:38 +01:00 |  |
| Matthew Honnibal | f02c39dfaf | * Compare to is not None, for more robustness | 2015-03-26 16:44:48 +01:00 |  |
| Matthew Honnibal | 8f68b864c4 | * Move Span/Spans to separate files. Currently duplicates lots of Tokens functionality. Should probably be integrated into Tokens | 2015-03-26 16:44:48 +01:00 |  |
| Matthew Honnibal | 056c672caf | * Bug fixes to tokenization, and support for times | 2015-03-26 16:44:48 +01:00 |  |
| Matthew Honnibal | ee385b439a | * Ensure StringStore is dumped during training | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | e854ba0a13 | * Remove support for force_gold flag from GreedyParser, since it's not so useful, and it's clutter | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 6a6085f8b9 | * Clean up GreedyParser.train function a bit | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | b3157927e6 | * Clean up unused feature templates | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 411bf377d4 | * Remove dependency on ner_util module | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 01c892f583 | * Add comment to fill_context | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 2741179aff | * Important bug fix: Fill token N2w, which was being unfilled, after a bad edit while writing the NER features. | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 2b2dec95d3 | * Add comment to set_parse | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | e770fade1e | * Don't set dependency labels in set_parse, as this may be used by the Entity recogniser instead. Need to clean this method up... | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 71648205d9 | * Add support for debug feature set. Just use unigrams for this. | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 3b70b304b2 | * Add words to gold_tuples from gold conll file | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 2e12dec76e | * Adjust scorer to account for tokenization mistakes | 2015-03-26 16:44:47 +01:00 |  |
| Matthew Honnibal | 221f43c370 | * Ensure better separation between score printing and training in train.py | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 6d49f8717b | * Move scoring away from training. Does not support scoring on gold preproc. | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 05d6065e2e | * Add assertion | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 377e9b29b1 | * Whitespace | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 670959f40c | * Fix iteration order on Tokens.rights | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 231ce2dae5 | * Assign ROOT label by default. May be papering over another bug. | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 9f4ad8fdfb | * Assign root words the ROOT label via the Break transition. Something is still wrong here... | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 52429625f0 | * Add write_parses function | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 0c91dd9e15 | * Re-enable entity training | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | f729164c01 | * Fix bug in label assignment: ensure null-label transitions receive the label 0 | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | ee927fbbb4 | * Fix test_morph_exceptions | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 7237c805c7 | * Load tag for specials.json token | 2015-03-26 16:44:46 +01:00 |  |
| Matthew Honnibal | 13520e6cf0 | * Add i.e. to specials.json | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 567388e38d | * Use values encoded by StringStore in POS tagging, rather than indices into a list of tags | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 3105c7f8ba | * Don't pass label_ids dict to Tokens, since we now use the StringStore to manage string-to-int mapping for labels | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 27d9df49e7 | * Upd sbd tests | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 801bf14f4f | * Clean up handling of dep_strings and ent_strings, using StringStore to encode the label names. | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 9061bbaf61 | * Move to fixing up ent_strings and dep_strings passing | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 31fad99518 | * Use StringStore to encode label names, instead of label_ids | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 64db61bff1 | * Add Span class to Python API | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | b9b695fb1b | * Remove debug word list | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 8f7eeb1c2d | * Add verbose flag for Scorer, for debugging, and fix ent_strings bug | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | f21ab2d7fb | * Fix bug in ugly ent_strings hack on English class | 2015-03-26 16:44:45 +01:00 |  |
| Matthew Honnibal | 1c843934be | * Fix oracle bug in NER. Now getting 77% F on ontonotes | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | 903f196b3f | * Fix verbose printing for scorer | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | e181c051d5 | * Improve features for NER | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | 7ecb52c0ed | * Add scorer script | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | 8057a95f20 | * NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only use current word, 2 words context. Need refactoring. | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | e99f19dd6c | * Fix clean function | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | ae235e07b9 | * Refactoring working for parser, but now need to rig up features for NER, and then debug oracle etc. | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | 4539c70542 | * Work on updating train script for named entity recognition | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | 357dcdcc01 | * Fix clean function | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | b3eda03c9c | * Tmp | 2015-03-26 16:44:44 +01:00 |  |
| Matthew Honnibal | 220ce8bfed | * Prepare English class for NER | 2015-03-26 16:44:44 +01:00 |  |