Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0127f10ba3 
							
						 
					 
					
						
						
							
							Improve train tensorizer script  
						
						
						
					 
					
						2018-11-03 10:54:20 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ba365ae1c9 
							
						 
					 
					
						
						
							
							Normalize gradient by number of words in tensorizer  
						
						
						
					 
					
						2018-11-03 10:53:22 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dac3f1b280 
							
						 
					 
					
						
						
							
							Improve Tensorizer  
						
						
						
					 
					
						2018-11-03 10:52:50 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							baf7feae68 
							
						 
					 
					
						
						
							
							Add tensorizer training example  
						
						
						
					 
					
						2018-11-02 23:30:06 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2527ba68e5 
							
						 
					 
					
						
						
							
							Fix tensorizer  
						
						
						
					 
					
						2018-11-02 23:29:54 +00:00 
						 
				 
			
				
					
						
							
							
								Suraj Rajan 
							
						 
					 
					
						
						
						
						
							
						
						
							0bf14082a4 
							
						 
					 
					
						
						
							
							Added more constructs for dependency tree matcher ( #2836 )  
						
						
						
					 
					
						2018-10-29 23:21:39 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							817e1fc5e5 
							
						 
					 
					
						
						
							
							Fix out-of-bounds access in NER training  
						
						
						
						
						The helper method state.B(1) gets the index of the first token of the
buffer, or -1 if no such token exists. Normally this is safe because we
pass this to functions like state.safe_get(), which returns an empty
token. Here we used it directly as an array index, which is not okay!
This error may have been the cause of out-of-bounds access errors during
training. Similar errors may still be around, so they must be hunted down.
Hunting this one down took a long time...I printed out values across
training runs and diffed, looking for points of divergence between
runs, when no randomness should be allowed. 
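To make the failure mode above concrete, here is a minimal Python sketch of a sentinel-returning helper being used with and without a bounds check. The names `buffer_index` and `safe_get` are hypothetical stand-ins for the `state.B()` and `state.safe_get()` helpers mentioned in the message, not spaCy's actual code, and `None` is assumed to play the role of the empty token.

```python
from typing import Optional, Sequence


def buffer_index(buffer: Sequence[int], i: int) -> int:
    """Hypothetical analogue of a state.B(i)-style helper.

    Returns the token index stored at buffer position i, or -1 when the
    buffer has no such position.
    """
    return buffer[i] if 0 <= i < len(buffer) else -1


def safe_get(tokens: Sequence[str], index: int) -> Optional[str]:
    """Bounds-checked lookup; None stands in for the 'empty token'."""
    if 0 <= index < len(tokens):
        return tokens[index]
    return None


tokens = ["New", "York", "City"]
buffer = [2]  # only one token is left in the buffer

idx = buffer_index(buffer, 1)    # no second buffer entry, so this is -1
checked = safe_get(tokens, idx)  # the sentinel is caught by the bounds check
unchecked = tokens[idx]          # Python silently wraps -1 to the last item;
                                 # in C the same index reads outside the array
```

In Python the unchecked lookup returns a wrong but valid element, while in compiled code the same pattern reads memory outside the array, which fits the nondeterministic behaviour across training runs that the message describes.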
						
					 
					
						2018-10-27 01:12:50 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ea20b72c08 
							
						 
					 
					
						
						
							
							💫  Make like_num work for prefixed numbers ( #2808 )  
						
						
						
						
						* Only split + prefix if not numbers
* Make like_num work for prefixed numbers
* Add test for like_num 
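A rough sketch of what a prefix-aware `like_num` check could look like. This is an illustration under stated assumptions, not the code from this commit: the function name `like_num_sketch`, the prefix set, and the small word list are all made up for the example.

```python
NUM_WORDS = {"zero", "one", "two", "three", "ten"}  # illustrative subset only
PREFIXES = ("+", "-", "±", "~")  # assumed set of numeric prefixes


def like_num_sketch(text: str) -> bool:
    """Return True if text looks like a number, ignoring one leading prefix."""
    # Drop a single leading sign-like prefix before the usual checks.
    if text.startswith(PREFIXES):
        text = text[1:]
    # Ignore thousands separators and decimal points.
    digits = text.replace(",", "").replace(".", "")
    if digits.isdigit():
        return True
    # Simple fractions such as "1/2".
    if text.count("/") == 1:
        num, denom = text.split("/")
        if num.isdigit() and denom.isdigit():
            return True
    # Spelled-out numbers.
    return text.lower() in NUM_WORDS
```

With this sketch, `like_num_sketch("+10")` and `like_num_sketch("-3.5")` are both true, while a bare `"+"` or `"+abc"` is not.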
						
					 
					
						2018-10-01 10:49:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b39810d692 
							
						 
					 
					
						
						
							
							Fix copy_reg compatibility on _serialize module  
						
						
						
					 
					
						2018-09-28 15:23:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f82f8ba5dd 
							
						 
					 
					
						
						
							
							Fix serialization when empty parser model.  Closes   #2482  
						
						
						
					 
					
						2018-09-28 15:18:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d5a6c63b62 
							
						 
					 
					
						
						
							
							Add regression test for  #2482  
						
						
						
					 
					
						2018-09-28 15:18:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e3e9fe18d4 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2018-09-28 14:27:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0323f5be0c 
							
						 
					 
					
						
						
							
							Fix _serialize module  
						
						
						
					 
					
						2018-09-28 14:27:24 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5d56eb70d7 
							
						 
					 
					
						
						
							
							Tidy up tests  
						
						
						
					 
					
						2018-09-27 16:41:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1f1bab9264 
							
						 
					 
					
						
						
							
							Remove unused import  
						
						
						
					 
					
						2018-09-27 16:41:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b42c123e5d 
							
						 
					 
					
						
						
							
							Fix regression introduced by  1759abf1e 
						
						
						
					 
					
						2018-09-25 11:08:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							500898907b 
							
						 
					 
					
						
						
							
							Fix regression in parser.begin_training()  
						
						
						
					 
					
						2018-09-25 11:08:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1759abf1e5 
							
						 
					 
					
						
						
							
							Fix bug in sentence starts for non-projective parses  
						
						
						
						
						The set_children_from_heads function assumed parse trees were
projective. However, non-projective parses may be passed in during
deserialization, or after deprojectivising. This caused incorrect
sentence boundaries to be set for non-projective parses. Closes #2772. 
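One way to derive sentence starts without assuming projectivity is to follow each token's head chain to its root and mark a boundary wherever the root changes. The sketch below illustrates that idea only; it is not the `set_children_from_heads` implementation, and the function name and the convention that a root is its own head are assumptions made for the example.

```python
from typing import List


def sentence_starts(heads: List[int]) -> List[bool]:
    """heads[i] is the absolute index of token i's head; a root heads itself."""

    def root_of(i: int) -> int:
        # Walk up the head chain; crossing arcs do not matter here.
        seen = set()
        while heads[i] != i:
            if i in seen:
                raise ValueError("cycle in heads")
            seen.add(i)
            i = heads[i]
        return i

    roots = [root_of(i) for i in range(len(heads))]
    # A token starts a sentence if it is first, or its root differs
    # from the root of the previous token.
    return [i == 0 or roots[i] != roots[i - 1] for i in range(len(heads))]


# Arcs 0->2 and 1->3 cross, so the first tree is non-projective.
starts = sentence_starts([2, 3, 2, 2, 5, 5])
```

Here `starts` marks tokens 0 and 4 as sentence starts, even though the first tree contains crossing arcs.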
						
					 
					
						2018-09-19 14:50:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							48fd36bf05 
							
						 
					 
					
						
						
							
							Fix test for issue 2772  
						
						
						
					 
					
						2018-09-19 14:47:27 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6cd920e088 
							
						 
					 
					
						
						
							
							Add xfail test for deprojectivization SBD bug  
						
						
						
					 
					
						2018-09-19 14:00:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							99a6011580 
							
						 
					 
					
						
						
							
							Avoid adding empty layer in model, to keep models backwards compatible  
						
						
						
					 
					
						2018-09-14 22:51:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c046392317 
							
						 
					 
					
						
						
							
							Trigger on_data hooks in parser model  
						
						
						
					 
					
						2018-09-14 20:51:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5afd98dff5 
							
						 
					 
					
						
						
							
							Add a stepping function, for changing batch sizes or learning rates  
						
						
						
					 
					
						2018-09-14 18:37:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							27c00f4f22 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2018-09-14 12:30:57 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f32b52e611 
							
						 
					 
					
						
						
							
							Fix bug that caused deprojectivisation to run multiple times  
						
						
						
					 
					
						2018-09-14 12:12:54 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8f2a6367e9 
							
						 
					 
					
						
						
							
							Fix usage of PyTorch BiLSTM in ud_train  
						
						
						
					 
					
						2018-09-13 22:54:59 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							afeddfff26 
							
						 
					 
					
						
						
							
							Fix PyTorch BiLSTM  
						
						
						
					 
					
						2018-09-13 22:54:34 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a26fe8e7bb 
							
						 
					 
					
						
						
							
							Small hack in Language.update to make torch work  
						
						
						
					 
					
						2018-09-13 22:51:52 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							445b81ce3f 
							
						 
					 
					
						
						
							
							Support bilstm_depth argument in ud-train  
						
						
						
					 
					
						2018-09-13 19:30:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b43643a953 
							
						 
					 
					
						
						
							
							Support bilstm_depth option in parser  
						
						
						
					 
					
						2018-09-13 19:29:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							45032fe9e1 
							
						 
					 
					
						
						
							
							Support option of BiLSTM in Tok2Vec (requires pytorch)  
						
						
						
					 
					
						2018-09-13 19:28:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3eb9f3e2b8 
							
						 
					 
					
						
						
							
							Fix defaults for ud-train  
						
						
						
					 
					
						2018-09-13 18:05:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59cf533879 
							
						 
					 
					
						
						
							
							Improve ud-train script. Make config optional  
						
						
						
					 
					
						2018-09-13 14:24:08 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3e3a309764 
							
						 
					 
					
						
						
							
							Fix tagger  
						
						
						
					 
					
						2018-09-13 14:14:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							da7650e84b 
							
						 
					 
					
						
						
							
							Fix maximum doc length in ud_train script  
						
						
						
					 
					
						2018-09-13 14:10:25 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a95eea4c06 
							
						 
					 
					
						
						
							
							Fix multi-task objective for parser  
						
						
						
					 
					
						2018-09-13 14:08:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							21321cd6cf 
							
						 
					 
					
						
						
							
							Add tok2vec property to parser model  
						
						
						
					 
					
						2018-09-13 14:08:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d6aa60139d 
							
						 
					 
					
						
						
							
							Fix tagger training on GPU  
						
						
						
					 
					
						2018-09-13 14:05:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b2cb1fc67d 
							
						 
					 
					
						
						
							
							Merge matcher tests  
						
						
						
					 
					
						2018-09-06 01:39:53 +02:00 
						 
				 
			
				
					
						
							
							
								Suraj Krishnan Rajan 
							
						 
					 
					
						
						
						
						
							
						
						
							356af7b0a1 
							
						 
					 
					
						
						
							
							Fix tests  
						
						
						
					 
					
						2018-09-06 01:39:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4d2d7d5866 
							
						 
					 
					
						
						
							
							Fix new feature flags  
						
						
						
					 
					
						2018-08-27 02:12:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							598dbf1ce0 
							
						 
					 
					
						
						
							
							Fix character-based tokenization for Japanese  
						
						
						
					 
					
						2018-08-27 01:51:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3763e20afc 
							
						 
					 
					
						
						
							
							Pass subword_features and conv_depth params  
						
						
						
					 
					
						2018-08-27 01:51:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8051136d70 
							
						 
					 
					
						
						
							
							Support subword_features and conv_depth params in Tok2Vec  
						
						
						
					 
					
						2018-08-27 01:50:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9c33d4d1df 
							
						 
					 
					
						
						
							
							Add more hyper-parameters to spacy ud-train  
						
						
						
						
						* subword_features: Controls whether subword features are used in the
word embeddings. True by default (specifically, prefix, suffix and word
shape). Should be set to False for languages like Chinese and Japanese.
* conv_depth: Depth of the convolutional layers. Defaults to 4. 
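The two options and their stated defaults can be summarised in a small settings object. This is a hypothetical sketch: `Tok2VecSettings` and `settings_for` are names invented for the example and do not reflect how `ud-train` stores its configuration; only the defaults and the Chinese/Japanese advice come from the message above.

```python
from dataclasses import dataclass


@dataclass
class Tok2VecSettings:
    # Prefix, suffix and word-shape features in the word embeddings.
    subword_features: bool = True
    # Depth of the convolutional layers.
    conv_depth: int = 4


def settings_for(lang: str) -> Tok2VecSettings:
    """Turn subword features off for languages where they are not advised."""
    # Language codes are an assumption: "zh" for Chinese, "ja" for Japanese.
    return Tok2VecSettings(subword_features=lang not in {"zh", "ja"})


ja_settings = settings_for("ja")
en_settings = settings_for("en")
```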
						
					 
					
						2018-08-27 01:48:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							51a9efbf3b 
							
						 
					 
					
						
						
							
							Add draft Binder class  
						
						
						
					 
					
						2018-08-22 13:12:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f0e6be689a 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2018-08-16 17:18:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5ce459d2ee 
							
						 
					 
					
						
						
							
							Fix error in vocab  
						
						
						
					 
					
						2018-08-16 17:18:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							aeb49eb625 
							
						 
					 
					
						
						
							
							Update version [ci skip]  
						
						
						
					 
					
						2018-08-16 16:56:02 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a0eacd3293 
							
						 
					 
					
						
						
							
							Merge branch 'master' into develop  
						
						
						
					 
					
						2018-08-16 16:55:05 +02:00