Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2874b8efd8 
							
						 
					 
					
						
						
							
							Fix tok2vec loading in spacy train  
						
						
						
					 
					
						2018-11-15 23:34:54 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2ddd428834 
							
						 
					 
					
						
						
							
							Fix pretrain script  
						
						
						
					 
					
						2018-11-15 23:34:35 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f8afaa0c1c 
							
						 
					 
					
						
						
							
							Fix pretrain  
						
						
						
					 
					
						2018-11-15 22:46:53 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6af6950e46 
							
						 
					 
					
						
						
							
							Fix pretrain  
						
						
						
					 
					
						2018-11-15 22:45:36 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3e7b214e57 
							
						 
					 
					
						
						
							
							Make pretrain script work with stream from stdin  
						
						
						
					 
					
						2018-11-15 22:44:07 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8fdb9bc278 
							
						 
					 
					
						
						
							
							💫  Add experimental ULMFit/BERT/Elmo-like pretraining  ( #2931 )  
						
						
						
						
						* Add 'spacy pretrain' command
* Fix pretrain command for Python 2
* Fix pretrain command
* Fix pretrain command 
						
					 
					
						2018-11-15 22:17:16 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8f2a6367e9 
							
						 
					 
					
						
						
							
							Fix usage of PyTorch BiLSTM in ud_train  
						
						
						
					 
					
						2018-09-13 22:54:59 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							445b81ce3f 
							
						 
					 
					
						
						
							
							Support bilstm_depth argument in ud-train  
						
						
						
					 
					
						2018-09-13 19:30:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3eb9f3e2b8 
							
						 
					 
					
						
						
							
							Fix defaults for ud-train  
						
						
						
					 
					
						2018-09-13 18:05:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59cf533879 
							
						 
					 
					
						
						
							
							Improve ud-train script. Make config optional  
						
						
						
					 
					
						2018-09-13 14:24:08 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							da7650e84b 
							
						 
					 
					
						
						
							
							Fix maximum doc length in ud_train script  
						
						
						
					 
					
						2018-09-13 14:10:25 +02:00 
						 
				 
			
				
					
						
							
							
								Maxim Kupfer 
							
						 
					 
					
						
						
						
						
							
						
						
							cebe50b5b8 
							
						 
					 
					
						
						
							
							Remove ')' for clarity ( #2737 )  
						
						
						
						
						Sorry, don't mean to be nitpicky, I just noticed this when going through the CLI and thought it was a quick fix. That said, if this was intentional, then please let me know. 
						
					 
					
						2018-09-10 11:31:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4d2d7d5866 
							
						 
					 
					
						
						
							
							Fix new feature flags  
						
						
						
					 
					
						2018-08-27 02:12:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9c33d4d1df 
							
						 
					 
					
						
						
							
							Add more hyper-parameters to spacy ud-train  
						
						
						
						
						* subword_features: Controls whether subword features are used in the
word embeddings. True by default (specifically, prefix, suffix and word
shape). Should be set to False for languages like Chinese and Japanese.
* conv_depth: Depth of the convolutional layers. Defaults to 4. 
						
					 
					
						2018-08-27 01:48:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							595c893791 
							
						 
					 
					
						
						
							
							Expose noise_level option in train CLI  
						
						
						
					 
					
						2018-08-16 00:41:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6ea981c839 
							
						 
					 
					
						
						
							
							Add converter for jsonl NER data  
						
						
						
					 
					
						2018-08-14 14:04:32 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							02c5c114d0 
							
						 
					 
					
						
						
							
							Fix usage of deprecated freqs.txt in init-model  
						
						
						
					 
					
						2018-08-14 13:19:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4336397ecb 
							
						 
					 
					
						
						
							
							Update develop from master  
						
						
						
					 
					
						2018-08-14 03:04:28 +02:00 
						 
				 
			
				
					
						
							
							
								Xiaoquan Kong 
							
						 
					 
					
						
						
						
						
							
						
						
							f0c9652ed1 
							
						 
					 
					
						
						
							
							New Feature: display more detail when Error E067 ( #2639 )  
						
						
						
						
						* Fix off-by-one error
* Add verbose option
* Update verbose option
* Update documents for verbose option 
						
					 
					
						2018-08-07 10:45:29 +02:00 
						 
				 
			
				
					
						
							
							
								Kaisa (Katarzyna) Korsak 
							
						 
					 
					
						
						
						
						
							
						
						
							e531a827db 
							
						 
					 
					
						
						
							
							Changed conllu2json to be able to extract NER tags ( #2594 )  
						
						
						
						
						* extract ner tags from conllu file if available
* fixed a bug in regex 
						
					 
					
						2018-07-25 22:21:31 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							d84b13e02c 
							
						 
					 
					
						
						
							
							Merge branch 'master' into develop  
						
						
						
					 
					
						2018-07-18 18:57:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ole Henrik Skogstrøm 
							
						 
					 
					
						
						
						
						
							
						
						
							6e2930a4a2 
							
						 
					 
					
						
						
							
							Conll(u)-bio converter ( #2525 )  
						
						
						
						
						* Started simple conllxbiluo converter
* Fix missing BIO to BILUO conversion 
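
The BIO to BILUO conversion named in this commit can be illustrated with a minimal sketch. This is not the code from the commit; the function name `bio_to_biluo` is hypothetical, and the sketch assumes tags of the form `B-TYPE`, `I-TYPE` and `O`.

```python
def bio_to_biluo(tags):
    """Convert a list of BIO tags to BILUO tags (illustrative sketch)."""
    biluo = []
    for i, tag in enumerate(tags):
        if tag == "O":
            biluo.append("O")
            continue
        if not (tag.startswith("B-") or tag.startswith("I-")):
            raise ValueError("Unexpected tag: {}".format(tag))
        label = tag[2:]
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is an I- tag of the same type.
        continues = next_tag == "I-" + label
        if tag.startswith("B-"):
            # A B- tag with no continuation is a single-token entity (U-).
            biluo.append(tag if continues else "U-" + label)
        else:
            # An I- tag with no continuation is the last token of the entity (L-).
            biluo.append(tag if continues else "L-" + label)
    return biluo


converted = bio_to_biluo(["B-PER", "I-PER", "O", "B-LOC"])
```

In BILUO, single-token entities become `U-` and the final token of a multi-token entity becomes `L-`, which is the information that plain BIO leaves implicit.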
						
					 
					
						2018-07-18 18:55:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8ae1bec8bf 
							
						 
					 
					
						
						
							
							Fix init_model  
						
						
						
					 
					
						2018-07-05 14:02:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dee8bdb900 
							
						 
					 
					
						
						
							
							Fix init-model for npz vectors  
						
						
						
					 
					
						2018-07-04 02:29:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59d655e8d0 
							
						 
					 
					
						
						
							
							Fix model init from jsonl  
						
						
						
					 
					
						2018-07-04 01:30:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1e38bea6e9 
							
						 
					 
					
						
						
							
							Save vectors init  
						
						
						
					 
					
						2018-07-03 23:55:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6692833887 
							
						 
					 
					
						
						
							
							Fix init_model  
						
						
						
					 
					
						2018-07-03 23:24:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4a38a26cb5 
							
						 
					 
					
						
						
							
							Fix init_model  
						
						
						
					 
					
						2018-07-03 22:57:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							019d09e3c3 
							
						 
					 
					
						
						
							
							Fix init model  
						
						
						
					 
					
						2018-07-03 22:16:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2543f8c93a 
							
						 
					 
					
						
						
							
							Support .npz vectors in init-model command  
						
						
						
					 
					
						2018-07-03 21:42:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							86aad11939 
							
						 
					 
					
						
						
							
							Fix init_model arg  
						
						
						
					 
					
						2018-07-03 17:00:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							eff42d36e3 
							
						 
					 
					
						
						
							
							Fix init model command  
						
						
						
					 
					
						2018-07-03 16:32:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6a89faf12e 
							
						 
					 
					
						
						
							
							Add support for jsonl-formatted lexical attributes to init-model command.  
						
						
						
					 
					
						2018-07-03 12:22:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c83fccfe2a 
							
						 
					 
					
						
						
							
							Fix output of best model  
						
						
						
					 
					
						2018-06-25 23:05:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							69c900f003 
							
						 
					 
					
						
						
							
							Fix init-model if no vectors provided  
						
						
						
					 
					
						2018-06-25 18:26:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							664f89327a 
							
						 
					 
					
						
						
							
							Fix init-model if no vectors provided  
						
						
						
					 
					
						2018-06-25 17:58:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c4698f5712 
							
						 
					 
					
						
						
							
							Don't collate model unless training succeeds  
						
						
						
					 
					
						2018-06-25 16:36:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							24dfbb8a28 
							
						 
					 
					
						
						
							
							Fix model collation  
						
						
						
					 
					
						2018-06-25 14:35:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							62237755a4 
							
						 
					 
					
						
						
							
							Import shutil  
						
						
						
					 
					
						2018-06-25 13:40:17 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a040fca99e 
							
						 
					 
					
						
						
							
							Import json into cli.train  
						
						
						
					 
					
						2018-06-25 11:50:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c703d99c2 
							
						 
					 
					
						
						
							
							Fix collation of best models  
						
						
						
					 
					
						2018-06-25 01:21:34 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c80b7c013 
							
						 
					 
					
						
						
							
							Collate best model after training  
						
						
						
					 
					
						2018-06-24 23:39:52 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							330c039106 
							
						 
					 
					
						
						
							
							Merge branch 'master' into develop  
						
						
						
					 
					
						2018-05-26 18:30:52 +02:00 
						 
				 
			
				
					
						
							
							
								James Messinger 
							
						 
					 
					
						
						
						
						
							
						
						
							4515e96e90 
							
						 
					 
					
						
						
							
							Better formatting for spacy train CLI ( #2357 )  
						
						
						
						
						* Better formatting for `spacy train` CLI
Changed to use fixed-spaces rather than tabs to align table headers and data.
### Before:
```
Itn.    P.Loss  N.Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %
0       4618.857        2910.004        76.172  79.645  67.987  88.732  88.261  100.000 4436.9  6376.4
1       4671.972        3764.812        74.481  78.046  62.374  82.680  88.377  100.000 4672.2  6227.1
2       4742.756        3673.473        71.994  77.380  63.966  84.494  90.620  100.000 4298.0  5983.9
```
### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```
* Added contributor file 
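
The fixed-width alignment described in this commit could be sketched roughly as follows. This is an assumption about the general approach (padding each cell with spaces to a per-column width), not the code from the commit; `format_table` and its `gutter` parameter are hypothetical names, and the values are taken from the first columns of the table above.

```python
def format_table(headers, rows, gutter=2):
    """Return table lines whose columns are padded with spaces, not tabs."""
    cells = [[str(h) for h in headers]] + [[str(c) for c in row] for row in rows]
    # Each column is as wide as its longest cell, plus a fixed gutter.
    widths = [max(len(r[i]) for r in cells) + gutter for i in range(len(headers))]
    return ["".join(c.ljust(w) for c, w in zip(row, widths)).rstrip() for row in cells]


lines = format_table(
    ["Itn.", "Dep Loss", "NER Loss", "UAS"],
    [[0, "4618.857", "2910.004", "76.172"], [1, "4671.972", "3764.812", "74.481"]],
)
print("\n".join(lines))
```

Because every cell is padded to a known width, the columns stay aligned regardless of the terminal's tab stops, which is what breaks the "Before" output.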
						
					 
					
						2018-05-25 13:08:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ce458c2428 
							
						 
					 
					
						
						
							
							Fix spacy requirement constraint in package template  
						
						
						
					 
					
						2018-05-22 20:50:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f3b4f6a4ec 
							
						 
					 
					
						
						
							
							Merge setup.py  
						
						
						
					 
					
						2018-05-20 23:21:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d4cc736b7c 
							
						 
					 
					
						
						
							
							💫  Improve model downloads: check for existing install, customise pip and use requests library again ( #2346 )  
						
						
						
						
						* Go back to using requests instead of urllib (closes  #2320 )
Fewer dependencies are good, but this one was simply causing too many other problems around SSL verification and Python 2/3 compatibility. requests is a popular enough package that it's okay for spaCy to depend on it – and this will hopefully make model downloads less flakey.
* Only download model if not installed (see #1456 )
Use #egg=model==version to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience.
* Pass additional options to pip when installing model (resolves  #1456 )
Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example:
python -m spacy download en --user
* Add CLI option to enable installing model package dependencies
* Revert "Add CLI option to enable installing model package dependencies"
This reverts commit 9336ffe695 
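
A rough sketch of how the `#egg=model==version` fragment and the pass-through pip options might be combined into one pip invocation is given below. It is only an illustration under stated assumptions: `build_pip_command` is a hypothetical helper, and the URL, model name and version are placeholders rather than values from the commit.

```python
import sys


def build_pip_command(download_url, model, version, user_pip_args=()):
    """Assemble a pip install command for a model archive (illustrative sketch)."""
    # The #egg=name==version fragment lets pip match an existing installation.
    target = "{url}#egg={model}=={version}".format(
        url=download_url, model=model, version=version
    )
    # Extra arguments (e.g. --user) are forwarded to pip unchanged.
    return [sys.executable, "-m", "pip", "install"] + list(user_pip_args) + [target]


cmd = build_pip_command(
    "https://example.com/en_core_web_sm-2.0.0.tar.gz",  # placeholder URL
    "en_core_web_sm",  # placeholder model name
    "2.0.0",  # placeholder version
    user_pip_args=["--user"],
)
```

The resulting list could be handed to `subprocess.call`; keeping the user's options as a separate list avoids any shell quoting issues.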
						
					 
					
						2018-05-20 20:26:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							74d5c625b3 
							
						 
					 
					
						
						
							
							Use rising beam update prob  
						
						
						
					 
					
						2018-05-16 20:11:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dc1a479fbd 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/refactor-parser  
						
						
						
					 
					
						2018-05-15 18:39:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							546dd99cdf 
							
						 
					 
					
						
						
							
							Merge master into develop -- mostly Arabic and website  
						
						
						
					 
					
						2018-05-15 18:14:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a6ae1ee6f7 
							
						 
					 
					
						
						
							
							Don't modify Token in global scope  
						
						
						
					 
					
						2018-05-09 00:43:00 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f94f721f40 
							
						 
					 
					
						
						
							
							Avoid importing fused token symbol in ud-run-test, until that's added  
						
						
						
					 
					
						2018-05-09 00:28:03 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							659ec5b975 
							
						 
					 
					
						
						
							
							Avoid importing fused token symbol in ud-run-test, until that's added  
						
						
						
					 
					
						2018-05-08 19:40:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fc4dd49b77 
							
						 
					 
					
						
						
							
							Support oracle segmentation in ud-train CLI command  
						
						
						
					 
					
						2018-05-08 13:47:45 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							7a3599c21a 
							
						 
					 
					
						
						
							
							Fix formatting and consistency  
						
						
						
					 
					
						2018-05-07 23:02:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							eddc0e0c74 
							
						 
					 
					
						
						
							
							Set gold.sent_starts in ud_train  
						
						
						
					 
					
						2018-05-07 15:52:47 +02:00 
						 
				 
			
				
					
						
							
							
								G.Pruvost 
							
						 
					 
					
						
						
						
						
							
						
						
							cc8e804648 
							
						 
					 
					
						
						
							
							#2211  - Support for ssl certs config on download command ( #2212 )  
						
						
						
						
						* Add support for SSL/Certs customization on download CLI
* Add a note on SSL options for the 'download' CLI in the README
* Add contributor agreement 
						
					 
					
						2018-05-03 18:37:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							723b328062 
							
						 
					 
					
						
						
							
							Add script to run UD test  
						
						
						
					 
					
						2018-04-29 15:50:25 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							17af6aa3a4 
							
						 
					 
					
						
						
							
							Update ud_train script  
						
						
						
					 
					
						2018-04-29 15:49:32 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c4a6d66fa 
							
						 
					 
					
						
						
							
							Merge master into develop. Big merge, many conflicts -- need to review  
						
						
						
					 
					
						2018-04-29 14:49:26 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							3c80f69ff5 
							
						 
					 
					
						
						
							
							Return data in cli.info and add silent option ( resolves   #2196 )  
						
						
						
					 
					
						2018-04-29 01:59:44 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							0299d5fac8 
							
						 
					 
					
						
						
							
							Update argument annotations and formatting  
						
						
						
					 
					
						2018-04-10 21:45:11 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							49b1e48bf5 
							
						 
					 
					
						
						
							
							Fix syntax error  
						
						
						
					 
					
						2018-04-10 21:44:59 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							70052e46e9 
							
						 
					 
					
						
						
							
							Fix formatting [ci skip]  
						
						
						
					 
					
						2018-04-10 21:42:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0ddb152be0 
							
						 
					 
					
						
						
							
							Improve error message when reading vectors  
						
						
						
					 
					
						2018-04-10 21:26:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							db50ac524e 
							
						 
					 
					
						
						
							
							Support zipped vector files in init-model  
						
						
						
					 
					
						2018-04-10 21:21:00 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							270fcfd925 
							
						 
					 
					
						
						
							
							Fix typo in package command message ( closes   #2200 )  
						
						
						
					 
					
						2018-04-10 19:14:31 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							24d8bf348d 
							
						 
					 
					
						
						
							
							Revert "Add support for .zip to init_model"  
						
						
						
						
						This reverts commit 7ee880a0ad 
						
					 
					
						2018-04-10 19:08:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7ee880a0ad 
							
						 
					 
					
						
						
							
							Add support for .zip to init_model  
						
						
						
					 
					
						2018-04-10 14:30:04 +00:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3141e04822 
							
						 
					 
					
						
						
							
							💫  New system for error messages and warnings ( #2163 )  
						
						
						
						
						* Add spacy.errors module
* Update deprecation and user warnings
* Replace errors and asserts with new error message system
* Remove redundant asserts
* Fix whitespace
* Add messages for print/util.prints statements
* Fix typo
* Fix typos
* Move CLI messages to spacy.cli._messages
* Add decorator to display error code with message
An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.
* Remove unused link in spacy.about
* Update errors for invalid pipeline components
* Improve error for unknown factories
* Add displaCy warnings
* Update formatting consistency
* Move error message to spacy.errors
* Update errors and check if doc returned by component is None 
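
The decorator idea described above (prefixing the error code only when the message is retrieved from its containing class) can be sketched as follows. This is a minimal illustration rather than the code from the commit: the names `add_codes` and `Errors`, and the message text for `E001`, are assumptions made for the example.

```python
def add_codes(err_cls):
    """Class decorator: prepend the attribute name (the code) on lookup."""

    class ErrorsWithCodes(object):
        def __getattribute__(self, code):
            # Read the raw message from the decorated class, then add the code.
            msg = getattr(err_cls, code)
            return "[{code}] {msg}".format(code=code, msg=msg)

    return ErrorsWithCodes()


@add_codes
class Errors(object):
    # Hypothetical message, used only for illustration.
    E001 = "No component '{name}' found in pipeline."


message = Errors.E001.format(name="parser")
```

The stored strings stay untouched; the code is attached at attribute access, so callers can still apply `.format()` to the result and no traceback handling is involved.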
						
					 
					
						2018-04-03 15:50:31 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a609a1ca29 
							
						 
					 
					
						
						
							
							Merge pull request  #2152  from explosion/feature/tidy-up-dependencies  
						
						
						
						
						💫  Tidy up dependencies 
					
						2018-03-29 14:35:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b5098079d8 
							
						 
					 
					
						
						
							
							Fix error on urllib  
						
						
						
					 
					
						2018-03-29 00:08:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							98e9cda677 
							
						 
					 
					
						
						
							
							Merge pull request  #2158  from explosion/feature/fix-multiple-vectors ( resolves   #1660 )  
						
						
						
						
						💫  Fix loading of multiple vector models 
					
						2018-03-28 23:08:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							17c3e7efa2 
							
						 
					 
					
						
						
							
							Add message noting vectors  
						
						
						
					 
					
						2018-03-28 16:33:43 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							7fbc9e5874 
							
						 
					 
					
						
						
							
							Replace requests with urllib  
						
						
						
					 
					
						2018-03-28 12:46:07 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							ac88c72c9a 
							
						 
					 
					
						
						
							
							Fix ftfy workaround and remove old import  
						
						
						
					 
					
						2018-03-28 12:14:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							070b6c6495 
							
						 
					 
					
						
						
							
							Remove dependency on ftfy  
						
						
						
					 
					
						2018-03-28 12:07:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b7136cb094 
							
						 
					 
					
						
						
							
							Support zipped vector files in init-model  
						
						
						
					 
					
						2018-03-27 21:01:18 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1f7229f40f 
							
						 
					 
					
						
						
							
							Revert "Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop"  
						
						
						
						
						This reverts commit c9ba3d3c2d92c26a35d4 
						
					 
					
						2018-03-27 19:23:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f57bfbccdc 
							
						 
					 
					
						
						
							
							Fix non-projective label filtering  
						
						
						
					 
					
						2018-03-27 13:41:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8bbd26579c 
							
						 
					 
					
						
						
							
							Support GPU in UD training script  
						
						
						
					 
					
						2018-03-27 09:53:35 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							406548b976 
							
						 
					 
					
						
						
							
							Support .gz and .tar.gz files in spacy init-model  
						
						
						
					 
					
						2018-03-24 17:18:32 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							85717f570c 
							
						 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/explosion/spaCy  
						
						
						
					 
					
						2018-03-23 20:30:42 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8902754f0b 
							
						 
					 
					
						
						
							
							Fix vector loading for ud_train  
						
						
						
					 
					
						2018-03-23 20:30:00 +01:00 
						 
				 
			
				
					
						
							
							
								Xiaoquan Kong 
							
						 
					 
					
						
						
						
						
							
						
						
							a71b99d7ff 
							
						 
					 
					
						
						
							
							bugfix for global-variable-change-in-runtime related issue ( #2135 )  
						
						
						
						
						* Bugfix: setting pollution from spacy/cli/ud_train.py to whole package
* Add contributor agreement of howl-anderson 
						
					 
					
						2018-03-23 11:36:38 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							044397e269 
							
						 
					 
					
						
						
							
							Support .gz and .tar.gz files in spacy init-model  
						
						
						
					 
					
						2018-03-21 14:33:23 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							bede11b67c 
							
						 
					 
					
						
						
							
							Improve label management in parser and NER ( #2108 )  
						
						
						
						
						This patch does a few smallish things that tighten up the training workflow a little, and allow memory use during training to be reduced by letting the GoldCorpus stream data properly.
Previously, the parser and entity recognizer read and saved labels as lists, with extra labels noted separately. Lists were used because ordering is very important, to ensure that the label-to-class mapping is stable.
We now manage labels as nested dictionaries, first keyed by the action, and then keyed by the label. Values are frequencies. The trick is, how do we save new labels? We need to make sure we iterate over these in the same order they're added. Otherwise, we'll get different class IDs, and the model's predictions won't make sense.
To allow stable sorting, we map the new labels to negative values. If we have two new labels, they'll be noted as having "frequency" -1 and -2. The next new label will then have "frequency" -3. When we sort by (frequency, label), we then get a stable sort.
Storing frequencies then allows us to make the next nice improvement. Previously we had to iterate over the whole training set, to pre-process it for the deprojectivisation. This led to storing the whole training set in memory. This was most of the required memory during training.
To prevent this, we now store the frequencies as we stream in the data, and deprojectivize as we go. Once we've built the frequencies, we can then apply a frequency cut-off when we decide how many classes to make.
Finally, to allow proper data streaming, we also have to have some way of shuffling the iterator. This is awkward if the training files have multiple documents in them. To solve this, the GoldCorpus class now writes the training data to disk in msgpack files, one per document. We can then shuffle the data by shuffling the paths.
This is a squash merge, as I made a lot of very small commits. Individual commit messages below.
* Simplify label management for TransitionSystem and its subclasses
* Fix serialization for new label handling format in parser
* Simplify and improve GoldCorpus class. Reduce memory use, write to temp dir
* Set actions in transition system
* Require thinc 6.11.1.dev4
* Fix error in parser init
* Add unicode declaration
* Fix unicode declaration
* Update textcat test
* Try to get model training on less memory
* Print json loc for now
* Try rapidjson to reduce memory use
* Remove rapidjson requirement
* Try rapidjson for reduced mem usage
* Handle None heads when projectivising
* Stream json docs
* Fix train script
* Handle projectivity in GoldParse
* Fix projectivity handling
* Add minibatch_by_words util from ud_train
* Minibatch by number of words in spacy.cli.train
* Move minibatch_by_words util to spacy.util
* Fix label handling
* More hacking at label management in parser
* Fix encoding in msgpack serialization in GoldParse
* Adjust batch sizes in parser training
* Fix minibatch_by_words
* Add merge_subtokens function to pipeline.pyx
* Register merge_subtokens factory
* Restore use of msgpack tmp directory
* Use minibatch-by-words in train
* Handle retokenization in scorer
* Change back-off approach for missing labels. Use 'dep' label
* Update NER for new label management
* Set NER tags for over-segmented words
* Fix label alignment in gold
* Fix label back-off for infrequent labels
* Fix int type in labels dict key
* Fix int type in labels dict key
* Update feature definition for 8 feature set
* Update ud-train script for new label stuff
* Fix json streamer
* Print the line number if conll eval fails
* Update children and sentence boundaries after deprojectivisation
* Export set_children_from_heads from doc.pxd
* Render parses during UD training
* Remove print statement
* Require thinc 6.11.1.dev6. Try adding wheel as install_requires
* Set different dev version, to flush pip cache
* Update thinc version
* Update GoldCorpus docs
* Remove print statements
* Fix formatting and links [ci skip] 
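
The negative-frequency trick described above can be illustrated with a minimal sketch. It is not the code from this patch: the helper names `add_label` and `ordered_labels` are hypothetical, and sorting in descending order (so that observed labels come before newly added ones) is an assumption made for the illustration.

```python
def add_label(label_freqs, label):
    """Record a new label with the next unused negative 'frequency'.

    Hypothetical helper: labels seen in training data keep their positive
    counts, while labels added later get -1, -2, -3, ... in insertion order.
    """
    if label in label_freqs:
        return
    lowest = min(label_freqs.values(), default=0)
    label_freqs[label] = min(lowest, 0) - 1


def ordered_labels(label_freqs):
    """Return labels in a deterministic order, highest frequency first.

    Sorting (frequency, label) pairs means ties are broken by the label
    string, so the same dict always yields the same class order.
    """
    pairs = sorted(((freq, label) for label, freq in label_freqs.items()),
                   reverse=True)
    return [label for _, label in pairs]


# Two labels observed in the data, then two labels added afterwards.
labels = {"nsubj": 120, "dobj": 80}
add_label(labels, "xcomp")
add_label(labels, "acl")
order = ordered_labels(labels)
```

Because each new label receives a strictly smaller value than any existing one, labels added later always sort after the earlier ones, so class IDs that were already assigned do not shift.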
						
					 
					
						2018-03-19 02:58:08 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d7ce6527fb 
							
						 
					 
					
						
						
							
							Use increasing batch sizes in ud-train  
						
						
						
					 
					
						2018-03-14 20:15:28 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5dddb30e5b 
							
						 
					 
					
						
						
							
							Fix ud-train script  
						
						
						
					 
					
						2018-03-11 01:26:45 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2cab4d6517 
							
						 
					 
					
						
						
							
							Remove use of attr module in ud_train  
						
						
						
					 
					
						2018-03-11 00:59:39 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							754ea1b2f7 
							
						 
					 
					
						
						
							
							Link in spaCy CoNLL commands  
						
						
						
					 
					
						2018-03-10 23:42:15 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3478ea76d1 
							
						 
					 
					
						
						
							
							Add ud_train and ud_evaluate CLI commands  
						
						
						
					 
					
						2018-03-10 23:41:55 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b59765ca9f 
							
						 
					 
					
						
						
							
							Stream gold during spacy train  
						
						
						
					 
					
						2018-03-10 22:32:45 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							86405e4ad1 
							
						 
					 
					
						
						
							
							Fix CLI for multitask objectives  
						
						
						
					 
					
						2018-02-18 10:59:11 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a34749b2bf 
							
						 
					 
					
						
						
							
							Add multitask objectives options to train CLI  
						
						
						
					 
					
						2018-02-17 22:03:54 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							262d0a3148 
							
						 
					 
					
						
						
							
							Fix overwriting of lexical attributes when loading vectors during training  
						
						
						
					 
					
						2018-02-17 18:11:11 +01:00 
						 
				 
			
				
					
						
							
							
								Johannes Dollinger 
							
						 
					 
					
						
						
						
						
							
						
						
							bf94c13382 
							
						 
					 
					
						
						
							
							Don't fix random seeds on import  
						
						
						
					 
					
						2018-02-13 12:42:23 +01:00 
						 
				 
			
				
					
						
							
							
								Ali Zarezade 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9df9da34a3 
							
						 
					 
					
						
						
							
							Fix init_model issue  
						
						
						
						
						Fixing issue #1928  
						
					 
					
						2018-02-03 17:21:34 +03:30 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							3c1fb9d02d 
							
						 
					 
					
						
						
							
							Make validate command fail more gracefully if version not found  
						
						
						
						
						Mostly relevant during development when working with .dev versions 
						
					 
					
						2018-01-31 22:06:28 +01:00 
						 
				 
			
				
					
						
							
							
								Adam Binford 
							
						 
					 
					
						
						
						
						
							
						
						
							1a2c2f7d7f 
							
						 
					 
					
						
						
							
							Fixed auto linking after download and added simple test to check  
						
						
						
					 
					
						2018-01-29 14:25:21 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7ca49c2061 
							
						 
					 
					
						
						
							
							Merge branch 'master' into feature-improve-model-download  
						
						
						
					 
					
						2018-01-10 18:21:55 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							10dab8eef8 
							
						 
					 
					
						
						
							
							Remove dummy variable from function calls  
						
						
						
					 
					
						2018-01-05 09:37:05 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							7f0ab145e9 
							
						 
					 
					
						
						
							
							Don't pass CLI command name as dummy argument  
						
						
						
					 
					
						2018-01-04 21:33:47 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							2c656f90fb 
							
						 
					 
					
						
						
							
							Exit with 1 if incompatible models found (see  #1714 )  
						
						
						
					 
					
						2018-01-03 21:20:35 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							dacfaa2ca4 
							
						 
					 
					
						
						
							
							Ensure that download command exits properly ( resolves   #1714 )  
						
						
						
					 
					
						2018-01-03 21:03:36 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							a9ff6eadc9 
							
						 
					 
					
						
						
							
							Prefix dummy argument names with underscore  
						
						
						
					 
					
						2018-01-03 20:48:12 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							1081e08efb 
							
						 
					 
					
						
						
							
							Fix formatting  
						
						
						
					 
					
						2018-01-03 20:14:50 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							d8109964d6 
							
						 
					 
					
						
						
							
							Use --no-deps on model install  
						
						
						
						
						In general, it's nice for models to specify spaCy as a dependency. However, this tends to cause problems in conda environments, as pip will re-install spaCy and its dependencies (especially Thinc) 
						
					 
					
						2018-01-03 17:40:37 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							319d754309 
							
						 
					 
					
						
						
							
							Fix overwriting of existing symlinks  
						
						
						
						
						Check for is_symlink() to also overwrite invalid and outdated symlinks. Also show better error message if link path exists but is not symlink (i.e. file or directory). 
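
The reason for checking `is_symlink()` can be shown with a short sketch using the standard library's `pathlib`. This is an illustration only, not the code from this commit: `replace_link` is a hypothetical helper, and it assumes a platform where creating symlinks is permitted.

```python
from pathlib import Path
import tempfile


def replace_link(link_path: Path, target: Path) -> None:
    """Point link_path at target, overwriting any existing symlink.

    A dangling symlink reports exists() == False but is_symlink() == True,
    so checking exists() alone would miss invalid or outdated links.
    """
    if link_path.is_symlink():
        # Covers valid, outdated and broken symlinks alike.
        link_path.unlink()
    elif link_path.exists():
        # A real file or directory: refuse to overwrite it.
        raise FileExistsError(
            "%s exists but is not a symlink" % link_path)
    link_path.symlink_to(target)
```

Refusing to touch a path that is a regular file or directory keeps the overwrite limited to links, which matches the better error message the commit describes for that case.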
						
					 
					
						2018-01-03 17:39:36 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							8ba0dfd017 
							
						 
					 
					
						
						
							
							Make message on failed linking more clear  
						
						
						
					 
					
						2018-01-03 17:38:09 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							d6327e8495 
							
						 
					 
					
						
						
							
							Fix handling case when vectors not specified  
						
						
						
					 
					
						2018-01-03 12:20:49 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							bcc51d7d8b 
							
						 
					 
					
						
						
							
							Fix shifted positional arguments  
						
						
						
					 
					
						2018-01-03 12:19:47 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							5a9d377580 
							
						 
					 
					
						
						
							
							Remove abbreviation for positional plac argument  
						
						
						
					 
					
						2017-12-11 11:08:29 +01:00 
						 
				 
			
				
					
						
							
							
								Isaac Sijaranamual 
							
						 
					 
					
						
						
						
						
							
						
						
							20ae0c459a 
							
						 
					 
					
						
						
							
							Fixes "Error saving model"  #1622  
						
						
						
					 
					
						2017-12-10 23:07:13 +01:00 
						 
				 
			
				
					
						
							
							
								Isaac Sijaranamual 
							
						 
					 
					
						
						
						
						
							
						
						
							e188b61960 
							
						 
					 
					
						
						
							
							Make cli/train.py not eat exception  
						
						
						
					 
					
						2017-12-10 22:53:08 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							5eaa61c2b8 
							
						 
					 
					
						
						
							
							Fix formatting  
						
						
						
					 
					
						2017-12-07 10:23:09 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							24e80c51b8 
							
						 
					 
					
						
						
							
							Document init-model command  
						
						
						
					 
					
						2017-12-07 10:14:37 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c91f451b0f 
							
						 
					 
					
						
						
							
							Fix imports and CLI in init-model  
						
						
						
					 
					
						2017-12-07 10:03:07 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							82e80ff928 
							
						 
					 
					
						
						
							
							Rename model command to init_model and fix formatting  
						
						
						
					 
					
						2017-12-07 09:59:23 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2feeb428d6 
							
						 
					 
					
						
						
							
							Merge pull request  #1646  from GreenRiverRUS/master  
						
						
						
						
						Added model command to create models from raw data 
						
					 
					
						2017-12-07 08:54:26 +00:00 
						 
				 
			
				
					
						
							
							
								Thomas Werkmeister 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							94eac75b7c 
							
						 
					 
					
						
						
							
							fix setup.py spacy req string for packaging  
						
						
						
						
						Requirement should be `spacy>=2.0.2` instead of `spacy2.0.2` 
						
					 
					
						2017-12-03 04:16:28 -06:00 
						 
				 
			
				
					
						
							
							
								Vadim Mazaev 
							
						 
					 
					
						
						
						
						
							
						
						
							495eacf470 
							
						 
					 
					
						
						
							
							Merge branch 'model_command'  
						
						
						
					 
					
						2017-11-30 12:30:26 +03:00 
						 
				 
			
				
					
						
							
							
								Vadim Mazaev 
							
						 
					 
					
						
						
						
						
							
						
						
							c332ffdde1 
							
						 
					 
					
						
						
							
							Added model command to create model from raw data:  
						
						
						
						
						words counts, brown clusters and vectors 
						
					 
					
						2017-11-27 01:21:47 +03:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2acc907d55 
							
						 
					 
					
						
						
							
							Improve profiling  
						
						
						
					 
					
						2017-11-23 12:33:03 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8d692771f6 
							
						 
					 
					
						
						
							
							Improve profiling  
						
						
						
					 
					
						2017-11-15 13:51:25 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							4c5d2c80d5 
							
						 
					 
					
						
						
							
							Re-add python -m to commands, too brittle :( (see  #1536 )  
						
						
						
					 
					
						2017-11-10 02:30:55 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							de45702bbe 
							
						 
					 
					
						
						
							
							Strip dev suffixes from version for compatibility check  
						
						
						
					 
					
						2017-11-08 18:40:21 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a2f980de4e 
							
						 
					 
					
						
						
							
							Exclude .devN versioning from compatibility check  
						
						
						
					 
					
						2017-11-08 18:03:52 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							a4662a31a9 
							
						 
					 
					
						
						
							
							Move model package templates to cli.package and update docs  
						
						
						
					 
					
						2017-11-07 12:15:35 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c2bbf076a4 
							
						 
					 
					
						
						
							
							Add document length cap for training  
						
						
						
					 
					
						2017-11-03 01:54:54 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							eca41f0cf6 
							
						 
					 
					
						
						
							
							Fix filename conversion for conllu  
						
						
						
					 
					
						2017-11-01 21:26:49 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e237472cdc 
							
						 
					 
					
						
						
							
							Fix tag and filename conversion for conllu  
						
						
						
					 
					
						2017-11-01 21:25:33 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							affd3404ab 
							
						 
					 
					
						
						
							
							Remove old model command (now "vocab")  
						
						
						
					 
					
						2017-11-01 13:14:03 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							37e62ab0e2 
							
						 
					 
					
						
						
							
							Update vector meta in meta.json  
						
						
						
					 
					
						2017-11-01 01:25:09 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c390f2d745 
							
						 
					 
					
						
						
							
							Make it easier to pass explicit no-pruning to vocab  
						
						
						
					 
					
						2017-10-31 20:14:47 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3659a807b0 
							
						 
					 
					
						
						
							
							Remove vector pruning arg from train CLI  
						
						
						
					 
					
						2017-10-31 19:21:05 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59203a2e8a 
							
						 
					 
					
						
						
							
							Move vector pruning command into spacy vocab cli tool  
						
						
						
					 
					
						2017-10-31 19:10:01 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							803e41bc66 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2017-10-30 18:39:51 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							abf8aa05d3 
							
						 
					 
					
						
						
							
							Populate --create-meta defaults from file if available  
						
						
						
						
						If meta.json is found in directory and user chooses to overwrite it, show existing data as defaults. 
						
					 
					
						2017-10-30 18:39:38 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							ce98fa7934 
							
						 
					 
					
						
						
							
							Fix formatting  
						
						
						
					 
					
						2017-10-30 18:38:55 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							98c35d2585 
							
						 
					 
					
						
						
							
							Fix spacy vocab command  
						
						
						
					 
					
						2017-10-30 18:38:41 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e98451b5f7 
							
						 
					 
					
						
						
							
							Add -prune-vectors argument to spacy.cli.train  
						
						
						
					 
					
						2017-10-30 18:00:10 +01:00 
						 
				 
			
				
					
						
							
							
								Explosion Bot 
							
						 
					 
					
						
						
						
						
							
						
						
							05a1dd570e 
							
						 
					 
					
						
						
							
							Fix vocab script  
						
						
						
					 
					
						2017-10-30 16:19:22 +01:00 
						 
				 
			
				
					
						
							
							
								Explosion Bot 
							
						 
					 
					
						
						
						
						
							
						
						
							b46bdce8d2 
							
						 
					 
					
						
						
							
							Add missing import  
						
						
						
					 
					
						2017-10-30 16:18:10 +01:00 
						 
				 
			
				
					
						
							
							
								Explosion Bot 
							
						 
					 
					
						
						
						
						
							
						
						
							0fc1209421 
							
						 
					 
					
						
						
							
							Wire up new vocab command  
						
						
						
					 
					
						2017-10-30 16:14:50 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							64e4ff7c4b 
							
						 
					 
					
						
						
							
							Merge 'tidy-up' changes into branch. Resolve conflicts  
						
						
						
					 
					
						2017-10-28 13:16:06 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							d941fc3667 
							
						 
					 
					
						
						
							
							Tidy up CLI  
						
						
						
					 
					
						2017-10-27 14:38:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							531142a933 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/develop' into feature/better-parser  
						
						
						
					 
					
						2017-10-27 12:34:48 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b9616419e1 
							
						 
					 
					
						
						
							
							Add try/except around bz2 import  
						
						
						
					 
					
						2017-10-27 01:18:05 +00:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							11e3f19764 
							
						 
					 
					
						
						
							
							Fix vectors data added after training (see  #1457 )  
						
						
						
					 
					
						2017-10-25 16:08:26 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							057954695b 
							
						 
					 
					
						
						
							
							Read pipeline and vector data off model in --generate-meta  
						
						
						
					 
					
						2017-10-25 16:03:26 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							273e638183 
							
						 
					 
					
						
						
							
							Add vector data to model meta after training (see  #1457 )  
						
						
						
					 
					
						2017-10-25 16:03:05 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							95f6174516 
							
						 
					 
					
						
						
							
							Remove tensorizer from model pipeline example in spacy package  
						
						
						
					 
					
						2017-10-24 16:00:56 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							24512420b1 
							
						 
					 
					
						
						
							
							Show error if data_path does not exist or is None (see  #1102 )  
						
						
						
					 
					
						2017-10-19 00:53:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dc01acd821 
							
						 
					 
					
						
						
							
							Escape encoding in validate function  
						
						
						
					 
					
						2017-10-12 22:23:21 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							fff1028391 
							
						 
					 
					
						
						
							
							Add validate CLI command  
						
						
						
					 
					
						2017-10-12 20:05:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a955843684 
							
						 
					 
					
						
						
							
							Increase default number of epochs  
						
						
						
					 
					
						2017-10-12 13:13:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							acba2e1051 
							
						 
					 
					
						
						
							
							Fix metadata in training  
						
						
						
					 
					
						2017-10-11 08:55:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							74c2c6a58c 
							
						 
					 
					
						
						
							
							Add default name and lang to meta  
						
						
						
					 
					
						2017-10-11 08:49:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5156074df1 
							
						 
					 
					
						
						
							
							Make loading code more consistent in train command  
						
						
						
					 
					
						2017-10-10 12:51:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							97c9b5db8b 
							
						 
					 
					
						
						
							
							Patch spacy.train for new pipeline management  
						
						
						
					 
					
						2017-10-09 23:41:16 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a635240398 
							
						 
					 
					
						
						
							
							Add conll_ner2json converter  
						
						
						
					 
					
						2017-10-09 22:03:26 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							735d18654d 
							
						 
					 
					
						
						
							
							Add NER converter for CoNLL 2003 data  
						
						
						
					 
					
						2017-10-09 20:06:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							808d8740d6 
							
						 
					 
					
						
						
							
							Remove print statement  
						
						
						
					 
					
						2017-10-09 08:45:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0f41b25f60 
							
						 
					 
					
						
						
							
							Add speed benchmarks to metadata  
						
						
						
					 
					
						2017-10-09 08:05:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							be4f0b6460 
							
						 
					 
					
						
						
							
							Update defaults  
						
						
						
					 
					
						2017-10-08 02:08:12 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9d66a915da 
							
						 
					 
					
						
						
							
							Update training defaults  
						
						
						
					 
					
						2017-10-07 21:02:38 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							09442d25ec 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/develop' into feature/parser-history-model  
						
						
						
					 
					
						2017-10-07 07:05:04 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f4c9a98166 
							
						 
					 
					
						
						
							
							Fix spacy evaluate command on non-GPU  
						
						
						
					 
					
						2017-10-06 13:17:47 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c6cd81f192 
							
						 
					 
					
						
						
							
							Wrap try/except around model saving  
						
						
						
					 
					
						2017-10-05 08:14:24 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5743b06e36 
							
						 
					 
					
						
						
							
							Wrap model saving in try/except  
						
						
						
					 
					
						2017-10-05 08:12:50 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							73ac0aa0b5 
							
						 
					 
					
						
						
							
							Update spacy evaluate and add displaCy option  
						
						
						
					 
					
						2017-10-04 00:03:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f24c2e3a8a 
							
						 
					 
					
						
						
							
							Fix evaluate for non-GPU  
						
						
						
					 
					
						2017-10-03 22:47:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1289187279 
							
						 
					 
					
						
						
							
							Fix circular import  
						
						
						
					 
					
						2017-10-03 09:33:21 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a44c4c3a5b 
							
						 
					 
					
						
						
							
							Add timer to evaluate  
						
						
						
					 
					
						2017-10-03 09:15:35 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8902df44de 
							
						 
					 
					
						
						
							
							Fix component disabling during training  
						
						
						
					 
					
						2017-10-02 21:07:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c617d288d8 
							
						 
					 
					
						
						
							
							Update pipeline component names in spaCy train  
						
						
						
					 
					
						2017-10-02 17:20:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f942903429 
							
						 
					 
					
						
						
							
							Improve sentence merging in iob2json  
						
						
						
					 
					
						2017-10-02 17:02:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							31681d20e0 
							
						 
					 
					
						
						
							
							Fix concatenation in iob2json converter  
						
						
						
					 
					
						2017-10-02 16:50:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4896ce3320 
							
						 
					 
					
						
						
							
							Remove misleading comment  
						
						
						
					 
					
						2017-10-02 00:09:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							94df115a81 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-10-01 14:06:23 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							69c7c642c2 
							
						 
					 
					
						
						
							
							Add spacy evaluate  
						
						
						
					 
					
						2017-10-01 14:05:04 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							fd1a9225d8 
							
						 
					 
					
						
						
							
							Handle conversion of pipeline components correctly  
						
						
						
						
						Allow both comma and comma + whitespace as separators 
						
					 
					
						2017-09-29 20:52:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ac8481a7b0 
							
						 
					 
					
						
						
							
							Print NER loss  
						
						
						
					 
					
						2017-09-28 08:05:31 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							542ebfa498 
							
						 
					 
					
						
						
							
							Improve defaults  
						
						
						
					 
					
						2017-09-27 18:54:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dcb86bdc43 
							
						 
					 
					
						
						
							
							Default batch size to 32  
						
						
						
					 
					
						2017-09-27 11:48:19 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							1ff62eaee7 
							
						 
					 
					
						
						
							
							Fix option shortcut to avoid conflict  
						
						
						
					 
					
						2017-09-26 17:59:34 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							7fdfb78141 
							
						 
					 
					
						
						
							
							Add version option to cli.train  
						
						
						
					 
					
						2017-09-26 17:34:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							698fc0d016 
							
						 
					 
					
						
						
							
							Remove merge artefact  
						
						
						
					 
					
						2017-09-26 08:31:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							defb68e94f 
							
						 
					 
					
						
						
							
							Update feature/noshare with recent develop changes  
						
						
						
					 
					
						2017-09-26 08:15:14 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							edf7e4881d 
							
						 
					 
					
						
						
							
							Add meta.json option to cli.train and add relevant properties  
						
						
						
						
						Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packageable.
						
					 
					
						2017-09-25 19:00:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							204b58c864 
							
						 
					 
					
						
						
							
							Fix evaluation during training  
						
						
						
					 
					
						2017-09-24 05:01:03 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dc3a623d00 
							
						 
					 
					
						
						
							
							Remove unused update_shared argument  
						
						
						
					 
					
						2017-09-24 05:00:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4348c479fc 
							
						 
					 
					
						
						
							
							Merge pre-trained vectors and noshare patches  
						
						
						
					 
					
						2017-09-22 20:07:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e93d43a43a 
							
						 
					 
					
						
						
							
							Fix training with preset vectors  
						
						
						
					 
					
						2017-09-22 20:00:40 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a2357cce3f 
							
						 
					 
					
						
						
							
							Set random seed in train script  
						
						
						
					 
					
						2017-09-23 02:57:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0a9016cade 
							
						 
					 
					
						
						
							
							Fix serialization during training  
						
						
						
					 
					
						2017-09-21 13:06:45 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							20193371f5 
							
						 
					 
					
						
						
							
							Don't share CNN, to reduce complexities  
						
						
						
					 
					
						2017-09-21 14:59:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1d73dec8b1 
							
						 
					 
					
						
						
							
							Refactor train script  
						
						
						
					 
					
						2017-09-20 19:17:10 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a0c4b33d03 
							
						 
					 
					
						
						
							
							Support resuming a model during spacy train  
						
						
						
					 
					
						2017-09-18 18:04:47 -05:00