Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a6d8d7c82e 
							
						 
					 
					
						
						
							
							Add is_gold_parse method to transition system  
						
						
						
					 
					
						2017-08-16 18:24:09 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3533bb61cb 
							
						 
					 
					
						
						
							
							Add option of 8 feature parse state  
						
						
						
					 
					
						2017-08-16 18:23:27 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							210f6d5175 
							
						 
					 
					
						
						
							
							Fix efficiency error in batch parse  
						
						
						
					 
					
						2017-08-15 03:19:03 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							23537a011d 
							
						 
					 
					
						
						
							
							Tweaks to beam parser  
						
						
						
					 
					
						2017-08-15 03:15:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							500e92553d 
							
						 
					 
					
						
						
							
							Fix memory error when copying scores in beam  
						
						
						
					 
					
						2017-08-15 03:15:04 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a8e4064dd8 
							
						 
					 
					
						
						
							
							Fix tensor gradient in parser  
						
						
						
					 
					
						2017-08-15 03:14:36 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e420e0366c 
							
						 
					 
					
						
						
							
							Remove use of hash function in beam parser  
						
						
						
					 
					
						2017-08-15 03:13:57 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							52c180ecf5 
							
						 
					 
					
						
						
							
							Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
						
						
						
						
						This reverts commit ea8de11ad508e443e083 
						
					 
					
						2017-08-14 13:00:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0ae045256d 
							
						 
					 
					
						
						
							
							Fix beam training  
						
						
						
					 
					
						2017-08-13 18:02:05 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6a42cc16ff 
							
						 
					 
					
						
						
							
							Fix beam parser, improve efficiency of non-beam  
						
						
						
					 
					
						2017-08-13 12:37:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							12de263813 
							
						 
					 
					
						
						
							
							Bug fixes to beam parsing. Learns small sample  
						
						
						
					 
					
						2017-08-13 09:33:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							17874fe491 
							
						 
					 
					
						
						
							
							Disable beam parsing  
						
						
						
					 
					
						2017-08-12 19:35:40 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3e30712b62 
							
						 
					 
					
						
						
							
							Improve defaults  
						
						
						
					 
					
						2017-08-12 19:24:17 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							28e930aae0 
							
						 
					 
					
						
						
							
							Fixes for beam parsing. Not working  
						
						
						
					 
					
						2017-08-12 19:22:52 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c96d769836 
							
						 
					 
					
						
						
							
							Fix beam parse. Not sure if working  
						
						
						
					 
					
						2017-08-12 18:21:54 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4638f4b869 
							
						 
					 
					
						
						
							
							Fix beam update  
						
						
						
					 
					
						2017-08-12 17:15:16 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d4308d2363 
							
						 
					 
					
						
						
							
							Initialize State offset to 0  
						
						
						
					 
					
						2017-08-12 17:14:39 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b353e4d843 
							
						 
					 
					
						
						
							
							Work on parser beam training  
						
						
						
					 
					
						2017-08-12 14:47:45 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cd5ecedf6a 
							
						 
					 
					
						
						
							
							Try drop_layer in parser  
						
						
						
					 
					
						2017-08-12 08:56:33 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1a59db1c86 
							
						 
					 
					
						
						
							
							Fix dropout and learn rate in parser  
						
						
						
					 
					
						2017-08-12 05:44:39 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d01dc3704a 
							
						 
					 
					
						
						
							
							Adjust parser model  
						
						
						
					 
					
						2017-08-09 20:06:33 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f37528ef58 
							
						 
					 
					
						
						
							
							Pass embed size for parser fine-tune. Use SELU  
						
						
						
					 
					
						2017-08-09 17:52:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bbace204be 
							
						 
					 
					
						
						
							
							Gate parser fine-tuning behind feature flag  
						
						
						
					 
					
						2017-08-09 16:40:42 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dbdd8afc4b 
							
						 
					 
					
						
						
							
							Fix parser fine-tune training  
						
						
						
					 
					
						2017-08-08 15:46:07 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							88bf1cf87c 
							
						 
					 
					
						
						
							
							Update parser for fine tuning  
						
						
						
					 
					
						2017-08-08 15:34:17 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							42bd26f6f3 
							
						 
					 
					
						
						
							
							Give parser its own tok2vec weights  
						
						
						
					 
					
						2017-08-06 18:33:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							78498a072d 
							
						 
					 
					
						
						
							
							Return Transition for missing actions in lookup_action  
						
						
						
					 
					
						2017-08-06 14:16:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bfffdeabb2 
							
						 
					 
					
						
						
							
							Fix parser batch-size bug introduced during cleanup  
						
						
						
					 
					
						2017-08-06 14:10:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7f876a7a82 
							
						 
					 
					
						
						
							
							Clean up some unused code in parser  
						
						
						
					 
					
						2017-08-06 00:00:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8fce187de4 
							
						 
					 
					
						
						
							
							Fix ArcEager for missing values  
						
						
						
					 
					
						2017-08-01 22:10:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							27abc56e98 
							
						 
					 
					
						
						
							
							Add method to get beam entities  
						
						
						
					 
					
						2017-07-29 21:59:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c86445bdfd 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-07-22 01:14:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3da1063b36 
							
						 
					 
					
						
						
							
							Add beam decoding to parser, to allow NER uncertainties  
						
						
						
					 
					
						2017-07-20 15:02:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0ca5832427 
							
						 
					 
					
						
						
							
							Improve negative example handling in NER oracle  
						
						
						
					 
					
						2017-07-20 00:18:49 +02:00 
						 
				 
			
				
					
						
							
							
								Tpt 
							
						 
					 
					
						
						
						
						
							
						
						
							57e8254f63 
							
						 
					 
					
						
						
							
							Adds function to extract french noun chunks  
						
						
						
					 
					
						2017-06-12 15:20:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6d0356e6cc 
							
						 
					 
					
						
						
							
							Whitespace  
						
						
						
					 
					
						2017-06-04 14:55:24 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							6669583f4e 
							
						 
					 
					
						
						
							
							Use OrderedDict  
						
						
						
					 
					
						2017-06-02 21:07:56 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							2f1025a94c 
							
						 
					 
					
						
						
							
							Port over Spanish changes from #1096
						
						
						
					 
					
						2017-06-02 19:09:58 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							fdd0923be4 
							
						 
					 
					
						
						
							
							Translate model=True in exclude to lower_model and upper_model  
						
						
						
					 
					
						2017-06-02 18:37:07 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4c97371051 
							
						 
					 
					
						
						
							
							Fixes for thinc 6.7  
						
						
						
					 
					
						2017-06-01 04:22:16 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ae8010b526 
							
						 
					 
					
						
						
							
							Move weight serialization to Thinc  
						
						
						
					 
					
						2017-06-01 02:56:12 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							097ab9c6e4 
							
						 
					 
					
						
						
							
							Fix transition system to/from disk  
						
						
						
					 
					
						2017-05-31 13:44:00 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							33e5ec737f 
							
						 
					 
					
						
						
							
							Fix to/from disk methods  
						
						
						
					 
					
						2017-05-31 13:43:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							53a3824334 
							
						 
					 
					
						
						
							
							Fix mistake in ner feature  
						
						
						
					 
					
						2017-05-31 03:01:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cc911feab2 
							
						 
					 
					
						
						
							
							Fix bug in NER state  
						
						
						
					 
					
						2017-05-30 22:12:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							be4a640f0c 
							
						 
					 
					
						
						
							
							Fix arc eager label costs for uint64  
						
						
						
					 
					
						2017-05-30 20:37:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							aa4c33914b 
							
						 
					 
					
						
						
							
							Work on serialization  
						
						
						
					 
					
						2017-05-29 08:40:45 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59f355d525 
							
						 
					 
					
						
						
							
							Fixes for serialization  
						
						
						
					 
					
						2017-05-29 13:38:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ff26aa6c37 
							
						 
					 
					
						
						
							
							Work on to/from bytes/disk serialization methods  
						
						
						
					 
					
						2017-05-29 11:45:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6b019b0540 
							
						 
					 
					
						
						
							
							Update to/from bytes methods  
						
						
						
					 
					
						2017-05-29 10:14:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9239f06ed3 
							
						 
					 
					
						
						
							
							Fix german noun chunks iterator  
						
						
						
					 
					
						2017-05-28 20:13:03 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fd9b6722a9 
							
						 
					 
					
						
						
							
							Fix noun chunks iterator for new stringstore  
						
						
						
					 
					
						2017-05-28 20:12:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7996d21717 
							
						 
					 
					
						
						
							
							Fixes for new StringStore  
						
						
						
					 
					
						2017-05-28 11:09:27 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8a24c60c1e 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-05-28 08:12:05 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bc97bc292c 
							
						 
					 
					
						
						
							
							Fix __call__ method  
						
						
						
					 
					
						2017-05-28 08:11:58 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							84e66ca6d4 
							
						 
					 
					
						
						
							
							WIP on stringstore change. 27 failures  
						
						
						
					 
					
						2017-05-28 14:06:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							39293ab2ee 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-05-28 11:46:57 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dd052572d4 
							
						 
					 
					
						
						
							
							Update arc eager for SBD changes  
						
						
						
					 
					
						2017-05-28 11:46:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c1263a844b 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-05-27 18:32:57 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9e711c3476 
							
						 
					 
					
						
						
							
							Divide d_loss by batch size  
						
						
						
					 
					
						2017-05-27 18:32:46 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a1d4c97fb7 
							
						 
					 
					
						
						
							
							Improve correctness of minibatching  
						
						
						
					 
					
						2017-05-27 17:59:00 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							49235017bf 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-05-27 16:34:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7ebd26b8aa 
							
						 
					 
					
						
						
							
							Use ordered dict to specify transitions  
						
						
						
					 
					
						2017-05-27 15:52:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3eea5383a1 
							
						 
					 
					
						
						
							
							Add move_names property to parser  
						
						
						
					 
					
						2017-05-27 15:51:55 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							99316fa631 
							
						 
					 
					
						
						
							
							Use ordered dict to specify actions  
						
						
						
					 
					
						2017-05-27 15:50:21 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							655ca58c16 
							
						 
					 
					
						
						
							
							Clarifying change to StateC.clone  
						
						
						
					 
					
						2017-05-27 15:49:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3d22fcaf0b 
							
						 
					 
					
						
						
							
							Return None from parser if there are no annotations  
						
						
						
					 
					
						2017-05-26 14:02:59 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3d5a536eaa 
							
						 
					 
					
						
						
							
							Improve efficiency of parser batching  
						
						
						
					 
					
						2017-05-26 11:31:23 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2cb7cc2db7 
							
						 
					 
					
						
						
							
							Remove commented code from parser  
						
						
						
					 
					
						2017-05-25 14:55:09 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c245ff6b27 
							
						 
					 
					
						
						
							
							Rebatch parser inputs, with mid-sentence states  
						
						
						
					 
					
						2017-05-25 11:18:59 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							679efe79c8 
							
						 
					 
					
						
						
							
							Make parser update less hacky  
						
						
						
					 
					
						2017-05-25 06:49:00 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e1cb5be0c7 
							
						 
					 
					
						
						
							
							Adjust dropout, depth and multi-task in parser  
						
						
						
					 
					
						2017-05-24 20:11:41 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							620df0414f 
							
						 
					 
					
						
						
							
							Fix dropout in parser  
						
						
						
					 
					
						2017-05-23 15:20:45 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8026c183d0 
							
						 
					 
					
						
						
							
							Add hacky logic to accelerate depth=0 case in parser  
						
						
						
					 
					
						2017-05-23 11:06:49 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a8b6d11c5b 
							
						 
					 
					
						
						
							
							Support optional maxout layer  
						
						
						
					 
					
						2017-05-23 05:58:07 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c55b8fa7c5 
							
						 
					 
					
						
						
							
							Fix bugs in parse_batch  
						
						
						
					 
					
						2017-05-23 05:57:52 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							964707d795 
							
						 
					 
					
						
						
							
							Restore support for deeper networks in parser  
						
						
						
					 
					
						2017-05-23 05:31:13 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6b918cc58e 
							
						 
					 
					
						
						
							
							Support making updates periodically during training  
						
						
						
					 
					
						2017-05-23 04:23:29 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3f725ff7b3 
							
						 
					 
					
						
						
							
							Roll back changes to parser update  
						
						
						
					 
					
						2017-05-23 04:23:05 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3959d778ac 
							
						 
					 
					
						
						
							
							Revert "Revert "WIP on improving parser efficiency""  
						
						
						
						
						This reverts commit 532afef4a8 
						
					 
					
						2017-05-23 03:06:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							532afef4a8 
							
						 
					 
					
						
						
							
							Revert "WIP on improving parser efficiency"  
						
						
						
						
						This reverts commit bdaac7ab44 
						
					 
					
						2017-05-23 03:05:25 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bdaac7ab44 
							
						 
					 
					
						
						
							
							WIP on improving parser efficiency  
						
						
						
					 
					
						2017-05-23 02:59:31 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8a9e318deb 
							
						 
					 
					
						
						
							
							Put the parsing loop in a nogil prange block  
						
						
						
					 
					
						2017-05-22 17:58:12 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e2136232f9 
							
						 
					 
					
						
						
							
							Exclude states with no matching gold annotations from parsing  
						
						
						
					 
					
						2017-05-22 10:30:12 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f00f821496 
							
						 
					 
					
						
						
							
							Fix pseudoprojectivity->nonproj  
						
						
						
					 
					
						2017-05-22 06:14:42 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5d59e74cf6 
							
						 
					 
					
						
						
							
							PseudoProjectivity->nonproj  
						
						
						
					 
					
						2017-05-22 05:49:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b45b4aa392 
							
						 
					 
					
						
						
							
							PseudoProjectivity --> nonproj  
						
						
						
					 
					
						2017-05-22 05:17:44 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							aae97f00e9 
							
						 
					 
					
						
						
							
							Fix nonproj import  
						
						
						
					 
					
						2017-05-22 05:15:06 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2a5eb9f61e 
							
						 
					 
					
						
						
							
							Make nonproj methods top-level functions, instead of class methods  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							33e2222839 
							
						 
					 
					
						
						
							
							Remove unused code in deprojectivize  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							025d9bbc37 
							
						 
					 
					
						
						
							
							Fix handling of non-projective deps  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1b5fa68996 
							
						 
					 
					
						
						
							
							Do pseudo-projective pre-processing for parser  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1d5d9838a2 
							
						 
					 
					
						
						
							
							Fix action collection for parser  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3b7c108246 
							
						 
					 
					
						
						
							
							Pass tokvecs through as a list, instead of concatenated. Also fix padding  
						
						
						
					 
					
						2017-05-20 13:23:32 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d52b65aec2 
							
						 
					 
					
						
						
							
							Revert "Move to contiguous buffer for token_ids and d_vectors"  
						
						
						
						
						This reverts commit 3ff8c35a79 
						
					 
					
						2017-05-20 11:26:23 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b272890a8c 
							
						 
					 
					
						
						
							
							Try to move parser to simpler PrecomputedAffine class. Currently broken -- maybe the previous change  
						
						
						
					 
					
						2017-05-20 06:40:10 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3ff8c35a79 
							
						 
					 
					
						
						
							
							Move to contiguous buffer for token_ids and d_vectors  
						
						
						
					 
					
						2017-05-20 04:17:30 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8b04b0af9f 
							
						 
					 
					
						
						
							
							Remove freqs from transition_system  
						
						
						
					 
					
						2017-05-20 02:20:48 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a1ba20e2b1 
							
						 
					 
					
						
						
							
							Fix over-run on parse_batch  
						
						
						
					 
					
						2017-05-19 18:57:30 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e84de028b5 
							
						 
					 
					
						
						
							
							Remove 'rebatch' op, and remove min-batch cap  
						
						
						
					 
					
						2017-05-19 18:16:36 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c12ab47a56 
							
						 
					 
					
						
						
							
							Remove state argument in pipeline. Other changes  
						
						
						
					 
					
						2017-05-19 13:26:36 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c2c825127a 
							
						 
					 
					
						
						
							
							Fix use_params and pipe methods  
						
						
						
					 
					
						2017-05-18 08:30:59 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fc8d3a112c 
							
						 
					 
					
						
						
							
							Add util.env_opt support: Can set hyper params through environment variables.  
						
						
						
					 
					
						2017-05-18 04:36:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d2626fdb45 
							
						 
					 
					
						
						
							
							Fix name error in nn parser  
						
						
						
					 
					
						2017-05-18 04:31:01 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							793430aa7a 
							
						 
					 
					
						
						
							
							Get spaCy train command working with neural network  
						
						
						
						
						* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab 
						
					 
					
						2017-05-17 12:04:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8cf097ca88 
							
						 
					 
					
						
						
							
							Redesign training to integrate NN components  
						
						
						
						
						* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly. 
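
The component contract listed above can be sketched roughly as follows. This is a toy stand-in, not spaCy code: the class `CapitalFlagger`, the dict-based documents, and the `run_pipeline` helper are hypothetical. Only the method names `begin_training()`, `predict()` and `set_annotations()` come from the commit message.

```python
class CapitalFlagger:
    """Hypothetical component: flags tokens that start with a capital letter."""

    def __init__(self):
        # No model is created on initialization.
        self.model = None

    def begin_training(self):
        # Stand-in for building a real model; here it is just a function.
        self.model = str.isupper

    def predict(self, docs):
        # Compute predictions without modifying the documents.
        return [[self.model(word[:1]) for word in doc["words"]] for doc in docs]

    def set_annotations(self, docs, predictions):
        # Write the predictions back onto the documents.
        for doc, flags in zip(docs, predictions):
            doc["is_capitalized"] = flags


def run_pipeline(pipeline, docs):
    """Apply each component in order: predict first, then annotate."""
    for component in pipeline:
        predictions = component.predict(docs)
        component.set_annotations(docs, predictions)
    return docs


flagger = CapitalFlagger()
flagger.begin_training()
docs = run_pipeline([flagger], [{"words": ["Hello", "world"]}])
```

Keeping `predict()` separate from `set_annotations()` means predictions can be computed for a whole batch before any document is modified.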
						
					 
					
						2017-05-16 16:17:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5211645af3 
							
						 
					 
					
						
						
							
							Get data flowing through pipeline. Needs redesign  
						
						
						
					 
					
						2017-05-16 11:21:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a9edb3aa1d 
							
						 
					 
					
						
						
							
							Improve integration of NN parser, to support unified training API  
						
						
						
					 
					
						2017-05-15 21:53:27 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4b9d69f428 
							
						 
					 
					
						
						
							
							Merge branch 'v2' into develop  
						
						
						
						
						* Move v2 parser into nn_parser.pyx
* New TokenVectorEncoder class in pipeline.pyx
* New spacy/_ml.py module
Currently the two parsers live side-by-side, until we figure out how to
organize them. 
						
					 
					
						2017-05-14 01:10:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5cac951a16 
							
						 
					 
					
						
						
							
							Move new parser to nn_parser.pyx, and restore old parser, to make tests pass.  
						
						
						
					 
					
						2017-05-14 00:55:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f8c02b4341 
							
						 
					 
					
						
						
							
							Remove cupy imports from parser, so it can work on CPU  
						
						
						
					 
					
						2017-05-14 00:37:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e6d71e1778 
							
						 
					 
					
						
						
							
							Small fixes to parser  
						
						
						
					 
					
						2017-05-13 17:19:04 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							188c0f6949 
							
						 
					 
					
						
						
							
							Clean up unused import  
						
						
						
					 
					
						2017-05-13 17:18:27 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f85c8464f7 
							
						 
					 
					
						
						
							
							Draft support of regression loss in parser  
						
						
						
					 
					
						2017-05-13 17:17:27 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							827b5af697 
							
						 
					 
					
						
						
							
							Update draft of parser neural network model  
						
						
						
						
						Model is good, but code is messy. Currently requires Chainer, which may cause the build to fail on machines without a GPU.
Outline of the model:
We first predict context-sensitive vectors for each word in the input:
(embed_lower | embed_prefix | embed_suffix | embed_shape)
>> Maxout(token_width)
>> convolution ** 4
This convolutional layer is shared between the tagger and the parser. This prevents the parser from needing tag features.
To boost the representation, we make a "super tag" with POS, morphology and dependency label. The tagger predicts this
by adding a softmax layer onto the convolutional layer --- so, we're teaching the convolutional layer to give us a
representation that's one affine transform from this informative lexical information. This is obviously good for the
parser (which backprops to the convolutions too).
The parser model makes a state vector by concatenating the vector representations for its context tokens. Current
results suggest that few context tokens work well. Maybe this is a bug.
The current context tokens:
* S0, S1, S2: Top three words on the stack
* B0, B1: First two words of the buffer
* S0L1, S0L2: Leftmost and second leftmost children of S0
* S0R1, S0R2: Rightmost and second rightmost children of S0
* S1L1, S1L2, S1R1, S1R2, B0L1, B0L2: Likewise for S1 and B0
This makes the state vector quite long: 13*T, where T is the token vector width (128 is working well). Fortunately,
there's a way to structure the computation to save some expense (and make it more GPU friendly).
The parser typically visits 2*N states for a sentence of length N (although it may visit more, if it back-tracks
with a non-monotonic transition). A naive implementation would require 2*N (B, 13*T) @ (13*T, H) matrix multiplications
for a batch of size B. We can instead perform one (B*N, T) @ (T, 13*H) multiplication, to pre-compute the hidden
weights for each positional feature wrt the words in the batch. (Note that our token vectors come from the CNN
-- so we can't play this trick over the vocabulary. That's how Stanford's NN parser works --- and why its model
is so big.)
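
The pre-computation trick can be illustrated with a small pure-Python sketch. It is not the parser's code: the helper names, the tiny sizes, and the random data are all assumptions, and it leaves out the bias, the nonlinearity and padding for missing context tokens. It does one (N, T) @ (T, H) product per feature slot, which is equivalent to the single (N, T) @ (T, 13*H) product described above, and checks that summing the cached rows matches the naive product over the concatenated state vector.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def precompute_hidden(token_vectors, weights):
    """cache[f][i] is the hidden contribution of token i when it fills slot f."""
    return [matmul(token_vectors, w_f) for w_f in weights]


def hidden_from_cache(cache, token_ids):
    """Sum the cached rows for the tokens currently filling each slot."""
    n_hidden = len(cache[0][0])
    total = [0.0] * n_hidden
    for slot, tok in enumerate(token_ids):
        for h in range(n_hidden):
            total[h] += cache[slot][tok][h]
    return total


def hidden_naive(token_vectors, weights, token_ids):
    """Concatenate the context vectors and multiply by the stacked weights."""
    state = [x for tok in token_ids for x in token_vectors[tok]]  # length F*T
    stacked = [row for w_f in weights for row in w_f]             # shape (F*T, H)
    return matmul([state], stacked)[0]


random.seed(0)
N, T, H, F = 5, 4, 3, 13  # tokens, token width, hidden width, feature slots
tokens = [[random.uniform(-1, 1) for _ in range(T)] for _ in range(N)]
weights = [[[random.uniform(-1, 1) for _ in range(H)] for _ in range(T)]
           for _ in range(F)]

cache = precompute_hidden(tokens, weights)          # done once per sentence
context = [random.randrange(N) for _ in range(F)]   # token index for each slot
fast = hidden_from_cache(cache, context)            # done once per state
slow = hidden_naive(tokens, weights, context)
```

After the cache is built, each parser state costs only F row lookups and additions, so no per-state matrix multiplication is needed.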
This pre-computation strategy allows a nice compromise between GPU-friendliness and implementation simplicity.
The CNN and the wide lower layer are computed on the GPU, and then the precomputed hidden weights are moved
to the CPU, before we start the transition-based parsing process. This makes a lot of things much easier.
We don't have to worry about variable-length batch sizes, and we don't have to implement the dynamic oracle
in CUDA to train.
Currently the parser's loss function is multilabel log loss, as the dynamic oracle allows multiple states to
be 0 cost. This is defined as:
(exp(score) / Z) - (exp(score) / gZ)
Where Z is the sum of exp(score) over all classes, and gZ is the sum of exp(score) over the gold classes. I'm very interested in regressing on the cost directly,
but so far this isn't working well.
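
The expression above reads as the gradient of the loss with respect to each class score. The sketch below assumes that Z and gZ are sums of exponentiated scores, and that the second term applies only to gold (zero-cost) classes. The function name and the example scores are hypothetical.

```python
import math


def multilabel_log_loss_grad(scores, is_gold):
    """Gradient of -log(gZ / Z) with respect to each score.

    Gold classes get exp(s)/Z - exp(s)/gZ; all other classes get exp(s)/Z.
    """
    if not any(is_gold):
        raise ValueError("at least one class must be gold")
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    Z = sum(exps)
    gZ = sum(e for e, gold in zip(exps, is_gold) if gold)
    return [e / Z - (e / gZ if gold else 0.0) for e, gold in zip(exps, is_gold)]


# Two of the four actions are zero-cost under the oracle.
grad = multilabel_log_loss_grad([2.0, 1.0, 0.5, -1.0], [True, False, True, False])
```

Under these assumptions the gradient sums to zero, so a descent step moves probability mass from the non-gold classes to the gold ones without forcing the model to prefer one particular gold class.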
Machinery is in place for beam-search, which has been working well for the linear model. Beam search should benefit
greatly from the pre-computation trick. 
						
					 
					
						2017-05-12 16:09:15 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b44f7e259c 
							
						 
					 
					
						
						
							
							Clean up unused parser code  
						
						
						
					 
					
						2017-05-08 15:42:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							17efb1c001 
							
						 
					 
					
						
						
							
							Change width  
						
						
						
					 
					
						2017-05-08 08:40:13 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bef89ef23d 
							
						 
					 
					
						
						
							
							Mergery  
						
						
						
					 
					
						2017-05-08 08:29:36 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							50ddc9fc45 
							
						 
					 
					
						
						
							
							Fix infinite loop bug  
						
						
						
					 
					
						2017-05-08 07:54:26 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a66a4a4d0f 
							
						 
					 
					
						
						
							
							Replace einsums  
						
						
						
					 
					
						2017-05-08 14:46:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8d2eab74da 
							
						 
					 
					
						
						
							
							Use PretrainableMaxouts  
						
						
						
					 
					
						2017-05-08 14:24:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2e2268a442 
							
						 
					 
					
						
						
							
							Precomputable hidden now working  
						
						
						
					 
					
						2017-05-08 11:36:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							10682d35ab 
							
						 
					 
					
						
						
							
							Get pre-computed version working  
						
						
						
					 
					
						2017-05-08 00:38:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							35458987e8 
							
						 
					 
					
						
						
							
							Checkpoint -- nearly finished reimpl  
						
						
						
					 
					
						2017-05-07 23:05:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4441866f55 
							
						 
					 
					
						
						
							
							Checkpoint -- nearly finished reimpl  
						
						
						
					 
					
						2017-05-07 22:47:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6782eedf9b 
							
						 
					 
					
						
						
							
							Tmp GPU code  
						
						
						
					 
					
						2017-05-07 11:04:24 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e420e5a809 
							
						 
					 
					
						
						
							
							Tmp  
						
						
						
					 
					
						2017-05-07 07:31:09 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							700979fb3c 
							
						 
					 
					
						
						
							
							CPU/GPU compat  
						
						
						
					 
					
						2017-05-07 04:01:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f99f5b75dc 
							
						 
					 
					
						
						
							
							working residual net  
						
						
						
					 
					
						2017-05-07 03:57:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bdf2dba9fb 
							
						 
					 
					
						
						
							
							WIP on refactor, with hidden pre-computing  
						
						
						
					 
					
						2017-05-07 02:02:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b439e04f8d 
							
						 
					 
					
						
						
							
							Learning smoothly  
						
						
						
					 
					
						2017-05-06 20:38:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							08bee76790 
							
						 
					 
					
						
						
							
							Learns things  
						
						
						
					 
					
						2017-05-06 18:24:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bcf4cd0a5f 
							
						 
					 
					
						
						
							
							Learns things  
						
						
						
					 
					
						2017-05-06 17:37:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8e48b58cd6 
							
						 
					 
					
						
						
							
							Gradients look correct  
						
						
						
					 
					
						2017-05-06 16:47:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7e04260d38 
							
						 
					 
					
						
						
							
							Data running through, likely errors in model  
						
						
						
					 
					
						2017-05-06 14:22:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ef4fa594aa 
							
						 
					 
					
						
						
							
							Draft of NN parser, to be tested  
						
						
						
					 
					
						2017-05-05 19:20:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ccaf26206b 
							
						 
					 
					
						
						
							
							Pseudocode for parser  
						
						
						
					 
					
						2017-05-04 12:17:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2da16adcc2 
							
						 
					 
					
						
						
							
							Add dropout option for parser and NER  
						
						
						
						
						Dropout can now be specified in the `Parser.update()` method via
the `drop` keyword argument, e.g.
    nlp.entity.update(doc, gold, drop=0.4)
This will randomly drop 40% of features, and multiply the value of the
others by 1. / 0.4. This may be useful for generalising from small data
sets.
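
A minimal sketch of feature dropout, for illustration only. The helper `dropout_features` is hypothetical and is not the parser's implementation. The message above states a rescaling factor of 1. / 0.4. This sketch instead uses the common "inverted dropout" factor 1 / (1 - drop), which keeps the expected feature value unchanged, so it may differ from what the commit does.

```python
import random


def dropout_features(values, drop, rng=random):
    """Zero each value with probability `drop` and rescale the survivors."""
    if not 0.0 <= drop < 1.0:
        raise ValueError("drop must be in [0, 1)")
    if drop == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - drop)  # assumption: inverted-dropout scaling
    return [v * keep_scale if rng.random() >= drop else 0.0 for v in values]


rng = random.Random(0)
features = [1.0] * 10000
dropped = dropout_features(features, drop=0.4, rng=rng)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_value = sum(dropped) / len(dropped)
```

With `drop=0.4`, roughly 60% of the features survive, and the mean of the rescaled values stays close to the original mean.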
This commit also patches the examples/training/train_new_entity_type.py
example, to use dropout and fix the output (previously it did not output
the learned entity). 
						
					 
					
						2017-04-27 13:18:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d2436dc17b 
							
						 
					 
					
						
						
							
							Update fix for Issue #999  
						
						
						
					 
					
						2017-04-23 18:14:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							60703cede5 
							
						 
					 
					
						
						
							
							Ensure noun chunks can't be nested. Closes #955  
						
						
						
					 
					
						2017-04-23 17:56:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4eef200bab 
							
						 
					 
					
						
						
							
							Persist the actions within spacy.parser.cfg  
						
						
						
					 
					
						2017-04-20 17:02:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							137b210bcf 
							
						 
					 
					
						
						
							
							Restore use of FTRL training  
						
						
						
					 
					
						2017-04-16 18:02:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							45464d065e 
							
						 
					 
					
						
						
							
							Remove print statement  
						
						
						
					 
					
						2017-04-15 16:11:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c76cb8af35 
							
						 
					 
					
						
						
							
							Fix training for new labels  
						
						
						
					 
					
						2017-04-15 16:11:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4884b2c113 
							
						 
					 
					
						
						
							
							Refix StepwiseState  
						
						
						
					 
					
						2017-04-15 16:00:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1a98e48b8e 
							
						 
					 
					
						
						
							
							Fix StepwiseState  
						
						
						
					 
					
						2017-04-15 13:35:01 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							0739ae7b76 
							
						 
					 
					
						
						
							
							Tidy up and fix formatting and imports  
						
						
						
					 
					
						2017-04-15 13:05:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							354458484c 
							
						 
					 
					
						
						
							
							WIP on add_label bug during NER training  
						
						
						
						
						Currently when a new label is introduced to NER during training,
it causes the labels to be read in in an unexpected order. This
invalidates the model. 
						
					 
					
						2017-04-14 23:52:17 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							49e2de900e 
							
						 
					 
					
						
						
							
							Add costs property to StepwiseState, to show which moves are gold.  
						
						
						
					 
					
						2017-04-10 11:37:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cc36c308f4 
							
						 
					 
					
						
						
							
							Fix noun_chunk rules around coordination  
						
						
						
						
						Closes #693. 
					
						2017-04-07 17:06:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1bb7b4ca71 
							
						 
					 
					
						
						
							
							Add comment  
						
						
						
					 
					
						2017-03-31 13:59:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							47a3ef06a6 
							
						 
					 
					
						
						
							
							Unhack deprojectivization, moving it into pipeline  
						
						
						
						
						Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898. 
						
					 
					
						2017-03-31 12:31:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a9b1f23c7d 
							
						 
					 
					
						
						
							
							Enable regression loss for parser  
						
						
						
					 
					
						2017-03-26 09:26:30 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b487b8735a 
							
						 
					 
					
						
						
							
							Decrease beam density, and fix Python 3 problem in beam  
						
						
						
					 
					
						2017-03-20 12:56:05 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c90dc7ac29 
							
						 
					 
					
						
						
							
							Clean up state initialisation in transition system  
						
						
						
					 
					
						2017-03-16 11:59:11 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a46933a8fe 
							
						 
					 
					
						
						
							
							Clean up FTRL parsing stuff.  
						
						
						
					 
					
						2017-03-16 11:58:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2611ac2a89 
							
						 
					 
					
						
						
							
							Fix scorer bug for NER, related to ambiguity between missing annotations and misaligned tokens  
						
						
						
					 
					
						2017-03-16 09:38:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3d0833c3df 
							
						 
					 
					
						
						
							
							Fix off-by-1 in parse features fill_context  
						
						
						
					 
					
						2017-03-15 19:55:35 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4ef68c413f 
							
						 
					 
					
						
						
							
							Approximate cost in Break transition, to speed things up a bit.  
						
						
						
					 
					
						2017-03-15 16:40:27 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8543db8a5b 
							
						 
					 
					
						
						
							
							Use ftrl optimizer in parser  
						
						
						
					 
					
						2017-03-15 11:56:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d719f8e77e 
							
						 
					 
					
						
						
							
							Use nogil in parser, and set L1 to 0.0 by default  
						
						
						
					 
					
						2017-03-15 09:31:01 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c61c501406 
							
						 
					 
					
						
						
							
							Update beam-parser to allow parser to maintain nogil  
						
						
						
					 
					
						2017-03-15 09:30:22 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c79b3129e3 
							
						 
					 
					
						
						
							
							Fix setting of empty lexeme in initial parse state  
						
						
						
					 
					
						2017-03-15 09:26:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6c4108c073 
							
						 
					 
					
						
						
							
							Add header for beam parser  
						
						
						
					 
					
						2017-03-11 12:45:12 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							931feb3360 
							
						 
					 
					
						
						
							
							Allow beam parsing for NER  
						
						
						
					 
					
						2017-03-11 11:12:01 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ca9c8c57c0 
							
						 
					 
					
						
						
							
							Add iteration argument to parser.update  
						
						
						
					 
					
						2017-03-11 07:00:47 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d59c6926c1 
							
						 
					 
					
						
						
							
							I think this fixes the segfault  
						
						
						
					 
					
						2017-03-11 06:58:34 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							318b9e32ff 
							
						 
					 
					
						
						
							
							WIP on beam parser. Currently segfaults.  
						
						
						
					 
					
						2017-03-11 06:19:52 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b0d80dc9ae 
							
						 
					 
					
						
						
							
							Update name of 'train' function in BeamParser  
						
						
						
					 
					
						2017-03-10 14:35:43 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d11f1a4ddf 
							
						 
					 
					
						
						
							
							Record negative costs in non-monotonic arc eager oracle  
						
						
						
					 
					
						2017-03-10 11:22:04 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ecf91a2dbb 
							
						 
					 
					
						
						
							
							Support beam parser  
						
						
						
					 
					
						2017-03-10 11:21:21 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c62da02344 
							
						 
					 
					
						
						
							
							Use ftrl training, to learn compressed model.  
						
						
						
					 
					
						2017-03-09 18:43:21 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							40703988bc 
							
						 
					 
					
						
						
							
							Use FTRL training in parser  
						
						
						
					 
					
						2017-03-08 01:38:51 +01:00 
						 
				 
			
				
					
						
							
							
								Roman Inflianskas 
							
						 
					 
					
						
						
						
						
							
						
						
							66e1109b53 
							
						 
					 
					
						
						
							
							Add support for Universal Dependencies v2.0  
						
						
						
					 
					
						2017-03-03 13:17:34 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							97a1286129 
							
						 
					 
					
						
						
							
							Revert changes to tagger and parser for thinc 6  
						
						
						
					 
					
						2017-01-09 10:08:34 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							af81ac8bb0 
							
						 
					 
					
						
						
							
							Use thinc 6.0  
						
						
						
					 
					
						2016-12-29 11:58:42 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bc0a202c9c 
							
						 
					 
					
						
						
							
							Fix unicode problem in nonproj module  
						
						
						
					 
					
						2016-11-25 17:29:17 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							159e8c46e1 
							
						 
					 
					
						
						
							
							Merge old training fixes with newer state  
						
						
						
					 
					
						2016-11-25 09:16:36 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							39341598bb 
							
						 
					 
					
						
						
							
							Fix NER label calculation  
						
						
						
					 
					
						2016-11-25 09:02:22 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ca773a1f53 
							
						 
					 
					
						
						
							
							Tweak arc_eager n_gold to deal with negative costs, and improve error message.  
						
						
						
					 
					
						2016-11-25 09:01:52 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							608d8f5421 
							
						 
					 
					
						
						
							
							Pass cfg through parser, and have is_valid default to 1, not 0 when resetting state  
						
						
						
					 
					
						2016-11-25 09:00:21 -06:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b8c4f5ea76 
							
						 
					 
					
						
						
							
							Allow German noun chunks to work on Span  
						
						
						
						
						Update the German noun chunks iterator, so that it also works on Span objects. 
						
					 
					
						2016-11-24 23:30:15 +11:00 
						 
				 
			
				
					
						
							
							
								Pokey Rule 
							
						 
					 
					
						
						
						
						
							
						
						
							3e3bda142d 
							
						 
					 
					
						
						
							
							Add noun_chunks to Span  
						
						
						
					 
					
						2016-11-24 10:47:20 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b86f8af0c1 
							
						 
					 
					
						
						
							
							Fix doc strings  
						
						
						
					 
					
						2016-11-01 12:25:36 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							708ea22208 
							
						 
					 
					
						
						
							
							Infer types in transition_system.pyx  
						
						
						
					 
					
						2016-10-27 18:08:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							301f3cc898 
							
						 
					 
					
						
						
							
							Fix Issue #429. Add an initialize_state method to the named entity recogniser that adds missing entity types. This is a messy place to add this, because it's strange to have the method mutate state. A better home for this logic could be found.  
						
						
						
					 
					
						2016-10-27 18:01:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							03a520ec4f 
							
						 
					 
					
						
						
							
							Change signature of Parser.parseC, so that nr_class is read from the transition system. This allows the transition system to modify the number of actions in initialize_state.  
						
						
						
					 
					
						2016-10-27 17:58:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a209b10579 
							
						 
					 
					
						
						
							
							Improve error message when oracle fails for non-projective trees, re Issue #571.  
						
						
						
					 
					
						2016-10-24 20:31:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3e688e6d4b 
							
						 
					 
					
						
						
							
							Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness.  
						
						
						
					 
					
						2016-10-23 17:45:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59038f7efa 
							
						 
					 
					
						
						
							
							Restore support for prior data format -- specifically, the labels field of the config.  
						
						
						
					 
					
						2016-10-17 00:53:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7887ab3b36 
							
						 
					 
					
						
						
							
							Fix default use of feature_templates in parser  
						
						
						
					 
					
						2016-10-16 21:41:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f787cd29fe 
							
						 
					 
					
						
						
							
							Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor.  
						
						
						
					 
					
						2016-10-16 21:34:57 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							274a4d4272 
							
						 
					 
					
						
						
							
							Fix queue Python property in StateClass  
						
						
						
					 
					
						2016-10-16 17:04:41 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e8c8aa08ce 
							
						 
					 
					
						
						
							
							Make action_name optional in StepwiseState  
						
						
						
					 
					
						2016-10-16 17:04:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4fc56d4a31 
							
						 
					 
					
						
						
							
							Rename 'labels' to 'actions' in parser options  
						
						
						
					 
					
						2016-10-16 11:42:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3259a63779 
							
						 
					 
					
						
						
							
							Whitespace  
						
						
						
					 
					
						2016-10-16 01:47:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d9ae2d68af 
							
						 
					 
					
						
						
							
							Load features by string-name for backwards compatibility.  
						
						
						
					 
					
						2016-10-12 20:15:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3a03c668c3 
							
						 
					 
					
						
						
							
							Fix message in ParserStateError  
						
						
						
					 
					
						2016-10-12 14:44:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6bf505e865 
							
						 
					 
					
						
						
							
							Fix error on ParserStateError  
						
						
						
					 
					
						2016-10-12 14:35:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ea23b64cc8 
							
						 
					 
					
						
						
							
							Refactor training, with new spacy.train module. Defaults still a little awkward.  
						
						
						
					 
					
						2016-10-09 12:24:24 +02:00