Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e2136232f9 
							
						 
					 
					
						
						
							
							Exclude states with no matching gold annotations from parsing  
						
						 
						
						
						
					 
					
						2017-05-22 10:30:12 -05:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8b04b0af9f 
							
						 
					 
					
						
						
							
							Remove freqs from transition_system  
						
						 
						
						
						
					 
					
						2017-05-20 02:20:48 -05:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							0739ae7b76 
							
						 
					 
					
						
						
							
							Tidy up and fix formatting and imports  
						
						 
						
						
						
					 
					
						2017-04-15 13:05:15 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							354458484c 
							
						 
					 
					
						
						
							
							WIP on add_label bug during NER training  
						
						 
						
						
						
						
						Currently when a new label is introduced to NER during training,
it causes the labels to be read in in an unexpected order. This
invalidates the model. 
						
					 
					
						2017-04-14 23:52:17 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2611ac2a89 
							
						 
					 
					
						
						
							
							Fix scorer bug for NER, related to ambiguity between missing annotations and misaligned tokens  
						
						 
						
						
						
					 
					
						2017-03-16 09:38:28 -05:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							931feb3360 
							
						 
					 
					
						
						
							
							Allow beam parsing for NER  
						
						 
						
						
						
					 
					
						2017-03-11 11:12:01 -06:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							159e8c46e1 
							
						 
					 
					
						
						
							
							Merge old training fixes with newer state  
						
						 
						
						
						
					 
					
						2016-11-25 09:16:36 -06:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							39341598bb 
							
						 
					 
					
						
						
							
							Fix NER label calculation  
						
						 
						
						
						
					 
					
						2016-11-25 09:02:22 -06:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							301f3cc898 
							
						 
					 
					
						
						
							
							Fix Issue #429. Add an initialize_state method to the named entity recogniser that adds missing entity types. This is a messy place to add this, because it's strange to have the method mutate state. A better home for this logic could be found.
						
						 
						
						
						
					 
					
						2016-10-27 18:01:55 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f787cd29fe 
							
						 
					 
					
						
						
							
							Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor.  
						
						 
						
						
						
					 
					
						2016-10-16 21:34:57 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9e09b39b9f 
							
						 
					 
					
						
						
							
							Revert "Changes to transition systems for new StringStore scheme"  
						
						 
						
						
						
						
						This reverts commit 0442e0ab1e.
						
					 
					
						2016-09-30 20:11:49 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0442e0ab1e 
							
						 
					 
					
						
						
							
							Changes to transition systems for new StringStore scheme  
						
						 
						
						
						
					 
					
						2016-09-30 19:58:51 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a47f00901b 
							
						 
					 
					
						
						
							
							* Pass a StateC pointer into the transition and validation methods in the parser, so that the GIL can be released over a batch of documents  
						
						 
						
						
						
					 
					
						2016-02-01 02:58:14 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							daaad66448 
							
						 
					 
					
						
						
							
							* Now fully proxied  
						
						 
						
						
						
					 
					
						2016-02-01 02:37:08 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							10877a7791 
							
						 
					 
					
						
						
							
							* Update for thinc 5.0, including changing cost from int to weight_t, and updating the tagger and parser  
						
						 
						
						
						
					 
					
						2016-01-30 14:31:36 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c8e0011ebc 
							
						 
					 
					
						
						
							
							* Add iterators to the NER and parser transition systems, to get the action types  
						
						 
						
						
						
					 
					
						2016-01-19 19:07:43 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5623242b3e 
							
						 
					 
					
						
						
							
							* Adjust NER rules, so that U entries in gazetteer don't become B moves to the model  
						
						 
						
						
						
					 
					
						2015-11-12 04:48:23 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							44fbdc7260 
							
						 
					 
					
						
						
							
							* Fix bug in NER transition system, that sometimes left no valid moves  
						
						 
						
						
						
					 
					
						2015-11-08 16:19:12 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e92371bb54 
							
						 
					 
					
						
						
							
							* Fix rule that made Last action invalid if there was a preset of O, since if the entity is already open, that ship has sailed.  
						
						 
						
						
						
					 
					
						2015-11-08 22:17:51 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							af70dc166a 
							
						 
					 
					
						
						
							
							* Fix Last restriction, that was supposed to prevent conflicts with presets, but was incorrect.  
						
						 
						
						
						
					 
					
						2015-11-07 09:52:00 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d24b8509e4 
							
						 
					 
					
						
						
							
							* Correct screw ups from the previous commits  
						
						 
						
						
						
					 
					
						2015-11-07 06:51:41 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5efad178b5 
							
						 
					 
					
						
						
							
							* Set ent tag when close entity  
						
						 
						
						
						
					 
					
						2015-11-07 06:09:25 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							01ab464383 
							
						 
					 
					
						
						
							
							* Prevent Begin and In moves from applying in NER if we're at the last token of a sentence, as this would mean the entity would span over a sentence boundary. Re Issue #169
						
						 
						
						
						
					 
					
						2015-11-07 05:30:44 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fe43f8cf39 
							
						 
					 
					
						
						
							
							* Whitespace  
						
						 
						
						
						
					 
					
						2015-08-09 02:31:53 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59c3bf60a6 
							
						 
					 
					
						
						
							
							* Ensure entity recognizer doesn't over-write preset types  
						
						 
						
						
						
					 
					
						2015-08-06 16:09:08 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9c1724ecae 
							
						 
					 
					
						
						
							
							* Gazetteer stuff working, now need to wire up to API  
						
						 
						
						
						
					 
					
						2015-08-06 00:35:40 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d5255aad77 
							
						 
					 
					
						
						
							
							* Update freqs for missing tags in ner, for serializer  
						
						 
						
						
						
					 
					
						2015-07-23 01:17:11 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							317cbbc015 
							
						 
					 
					
						
						
							
							* Serialization round trip now working with decent API, but with rough spots in the organisation and requiring vocabulary to be fixed ahead of time.  
						
						 
						
						
						
					 
					
						2015-07-19 15:18:17 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							75aeccc064 
							
						 
					 
					
						
						
							
							* Rejig parser interface to use new thinc.api.Example class, in prep of theano model. Comment out beam search  
						
						 
						
						
						
					 
					
						2015-06-28 11:02:34 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							579735a095 
							
						 
					 
					
						
						
							
							* Remove import of _state module  
						
						 
						
						
						
					 
					
						2015-06-23 17:25:08 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							15e177d7a1 
							
						 
					 
					
						
						
							
							* Fixes to unshift/fast-forward strategy. Getting 91.55 greedy on NW dev, gold preproc  
						
						 
						
						
						
					 
					
						2015-06-12 01:50:23 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e2f9a80713 
							
						 
					 
					
						
						
							
							* Remove old _state imports  
						
						 
						
						
						
					 
					
						2015-06-10 07:09:17 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							18cc326dc0 
							
						 
					 
					
						
						
							
							* Bug fixes to ner.pyx  
						
						 
						
						
						
					 
					
						2015-06-10 06:57:41 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d68c686ec1 
							
						 
					 
					
						
						
							
							* Move StateClass into interface of transition functions  
						
						 
						
						
						
					 
					
						2015-06-10 01:35:28 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4b98b3e9c8 
							
						 
					 
					
						
						
							
							* Cost functions now take StateClass argument, instead of State*.  
						
						 
						
						
						
					 
					
						2015-06-10 00:40:43 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e0cf61f591 
							
						 
					 
					
						
						
							
							* Move StateClass into the interface for is_valid  
						
						 
						
						
						
					 
					
						2015-06-09 23:23:28 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1fee7ade61 
							
						 
					 
					
						
						
							
							* Tweak to ner  
						
						 
						
						
						
					 
					
						2015-06-05 23:48:43 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							33e70b167f 
							
						 
					 
					
						
						
							
							* Remove dead code from ner.pyx  
						
						 
						
						
						
					 
					
						2015-06-05 17:12:47 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0114e7600d 
							
						 
					 
					
						
						
							
							* Fix NER oracle  
						
						 
						
						
						
					 
					
						2015-06-05 17:11:26 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6bf35cecc3 
							
						 
					 
					
						
						
							
							* Refactor transition system to use classes with staticmethods.  
						
						 
						
						
						
					 
					
						2015-06-05 02:27:17 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a513ec500f 
							
						 
					 
					
						
						
							
							* Have oracle functions take a struct instead of a Python object  
						
						 
						
						
						
					 
					
						2015-06-02 20:01:06 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0786d9b3c7 
							
						 
					 
					
						
						
							
							* Refactor TransitionSystem, adding set_valid method  
						
						 
						
						
						
					 
					
						2015-06-02 18:38:07 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c7876aa8b6 
							
						 
					 
					
						
						
							
							* Add get_valid method  
						
						 
						
						
						
					 
					
						2015-06-01 23:06:00 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							76300bbb1b 
							
						 
					 
					
						
						
							
							* Use updated JSON format, with sentences below paragraphs. Allows use of gold preprocessing flag.  
						
						 
						
						
						
					 
					
						2015-05-30 01:25:46 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fc75210941 
							
						 
					 
					
						
						
							
							* Move spacy.syntax.conll to spacy.gold  
						
						 
						
						
						
					 
					
						2015-05-24 21:35:02 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							20f1d868a3 
							
						 
					 
					
						
						
							
							* Tmp commit. Working on whole document parsing  
						
						 
						
						
						
					 
					
						2015-05-24 02:49:56 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							aff9359a8d 
							
						 
					 
					
						
						
							
							* Update ner.pyx to expect brackets from gold_tuples  
						
						 
						
						
						
					 
					
						2015-05-12 20:27:55 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fb8d50b3d5 
							
						 
					 
					
						
						
							
							Merge branch 'master' of ssh://github.com/honnibal/spaCy  
						
						 
						
						
						
					 
					
						2015-04-30 12:45:15 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b3fd48c97b 
							
						 
					 
					
						
						
							
							* Fix missing root labels bug identified in Issue #57
						
						 
						
						
						
					 
					
						2015-04-28 20:45:51 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Jordan Suchow 
							
						 
					 
					
						
						
						
						
							
						
						
							3a8d9b37a6 
							
						 
					 
					
						
						
							
							Remove trailing whitespace  
						
						 
						
						
						
					 
					
						2015-04-19 13:01:38 -07:00