Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d9d339186b 
							
						 
					 
					
						
						
							
							Fix dropout and batch-size defaults  
						
						
						
					 
					
						2018-12-01 13:42:35 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3139b020b5 
							
						 
					 
					
						
						
							
							Fix train script  
						
						
						
					 
					
						2018-11-30 22:17:08 +00:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							37c7c85a86 
							
						 
					 
					
						
						
							
							💫 New JSON helpers, training data internals & CLI rewrite (#2932)
						
						
						
						
						* Support nowrap setting in util.prints
* Tidy up and fix whitespace
* Simplify script and use read_jsonl helper
* Add JSON schemas (see #2928)
* Deprecate Doc.print_tree
Will be replaced with Doc.to_json, which will produce a unified format.
* Add Doc.to_json() method (see #2928)
Converts Doc objects to JSON using the same unified format as the training data. The method also supports serializing selected custom attributes in the doc._ space.
* Remove outdated test
* Add write_json and write_jsonl helpers
* WIP: Update spacy train
* Tidy up spacy train
* WIP: Use wasabi for formatting
* Add GoldParse helpers for JSON format
* WIP: add debug-data command
* Fix typo
* Add missing import
* Update wasabi pin
* Add missing import
* 💫 Refactor CLI (#2943)
To be merged into #2932.
## Description
- [x] refactor CLI to use [`wasabi`](https://github.com/ines/wasabi)
- [x] use [`black`](https://github.com/ambv/black) for auto-formatting
- [x] add `flake8` config
- [x] move all messy UD-related scripts to `cli.ud`
- [x] make converters functions that take the opened file and return the converted data (instead of having them handle the IO)
### Types of change
enhancement
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
* Update wasabi pin
* Delete old test
* Update errors
* Fix typo
* Tidy up and format remaining code
* Fix formatting
* Improve formatting of messages
* Auto-format remaining code
* Add tok2vec stuff to spacy.train
* Fix typo
* Update wasabi pin
* Fix path checks for when train() is called as function
* Reformat and tidy up pretrain script
* Update argument annotations
* Raise error if model language doesn't match lang
* Document new train command 
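
The converter change listed under #2943 (converters take an opened file and return the converted data, while the caller handles the IO) could be sketched roughly as follows. This is only an illustration under stated assumptions: `convert_lines` and `run_converter` are hypothetical names and the output format is made up for the example; neither is taken from spaCy's CLI.

```python
import io
import json


def convert_lines(file_):
    """Hypothetical converter: reads from an already-opened file object
    and returns the converted data. It never opens or writes files itself."""
    lines = [line.strip() for line in file_ if line.strip()]
    return [{"id": i, "text": text} for i, text in enumerate(lines)]


def run_converter(input_file, output_file, converter=convert_lines):
    """Caller-side IO: the caller owns the file handles and the serialization."""
    data = converter(input_file)
    json.dump(data, output_file)
    return data


# In-memory files stand in for real ones, so no disk access is needed.
source = io.StringIO("First sentence.\n\nSecond sentence.\n")
target = io.StringIO()
converted = run_converter(source, target)
```

Keeping the IO outside the converter means the same function can be exercised with `io.StringIO` in tests and with real file handles in the CLI.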
						
					 
					
						2018-11-30 20:16:14 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ef0820827a 
							
						 
					 
					
						
						
							
							Update hyper-parameters after NER random search (#2972)
						
						
						
						
						These experiments were completed a few weeks ago, but I didn't make the PR, pending model release.
    Token vector width: 128->96
    Hidden width: 128->64
    Embed size: 5000->2000
    Dropout: 0.2->0.1
    Updated optimizer defaults (unclear how important?)
This should improve speed, model size and load time, while keeping
similar or slightly better accuracy.
The tl;dr is we prefer to prevent over-fitting by reducing model size,
rather than using more dropout. 
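
As a compact view of the changes listed in this commit message, the old and new values can be put side by side. The numbers are the ones quoted above; the dictionary key names and the `describe_changes` helper are hypothetical and do not correspond to actual spaCy settings.

```python
# Values come from the commit message; the key names are hypothetical.
OLD_DEFAULTS = {"token_vector_width": 128, "hidden_width": 128, "embed_size": 5000, "dropout": 0.2}
NEW_DEFAULTS = {"token_vector_width": 96, "hidden_width": 64, "embed_size": 2000, "dropout": 0.1}


def describe_changes(old, new):
    """Return one 'name: old->new' line per setting whose value changed."""
    return ["{}: {}->{}".format(key, old[key], new[key]) for key in old if old[key] != new[key]]


changes = describe_changes(OLD_DEFAULTS, NEW_DEFAULTS)
print("\n".join(changes))
```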
						
					 
					
						2018-11-27 18:49:52 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2874b8efd8 
							
						 
					 
					
						
						
							
							Fix tok2vec loading in spacy train  
						
						
						
					 
					
						2018-11-15 23:34:54 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8fdb9bc278 
							
						 
					 
					
						
						
							
							💫 Add experimental ULMFit/BERT/Elmo-like pretraining (#2931)
						
						
						
						
						* Add 'spacy pretrain' command
* Fix pretrain command for Python 2
* Fix pretrain command
* Fix pretrain command 
						
					 
					
						2018-11-15 22:17:16 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							595c893791 
							
						 
					 
					
						
						
							
							Expose noise_level option in train CLI  
						
						
						
					 
					
						2018-08-16 00:41:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4336397ecb 
							
						 
					 
					
						
						
							
							Update develop from master  
						
						
						
					 
					
						2018-08-14 03:04:28 +02:00 
						 
				 
			
				
					
						
							
							
								Xiaoquan Kong 
							
						 
					 
					
						
						
						
						
							
						
						
							f0c9652ed1 
							
						 
					 
					
						
						
							
							New Feature: display more detail when Error E067 (#2639)
						
						
						
						
						* Fix off-by-one error
* Add verbose option
* Update verbose option
* Update documents for verbose option 
						
					 
					
						2018-08-07 10:45:29 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c83fccfe2a 
							
						 
					 
					
						
						
							
							Fix output of best model  
						
						
						
					 
					
						2018-06-25 23:05:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c4698f5712 
							
						 
					 
					
						
						
							
							Don't collate model unless training succeeds  
						
						
						
					 
					
						2018-06-25 16:36:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							24dfbb8a28 
							
						 
					 
					
						
						
							
							Fix model collation  
						
						
						
					 
					
						2018-06-25 14:35:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							62237755a4 
							
						 
					 
					
						
						
							
							Import shutil  
						
						
						
					 
					
						2018-06-25 13:40:17 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a040fca99e 
							
						 
					 
					
						
						
							
							Import json into cli.train  
						
						
						
					 
					
						2018-06-25 11:50:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c703d99c2 
							
						 
					 
					
						
						
							
							Fix collation of best models  
						
						
						
					 
					
						2018-06-25 01:21:34 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c80b7c013 
							
						 
					 
					
						
						
							
							Collate best model after training  
						
						
						
					 
					
						2018-06-24 23:39:52 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							330c039106 
							
						 
					 
					
						
						
							
							Merge branch 'master' into develop  
						
						
						
					 
					
						2018-05-26 18:30:52 +02:00 
						 
				 
			
				
					
						
							
							
								James Messinger 
							
						 
					 
					
						
						
						
						
							
						
						
							4515e96e90 
							
						 
					 
					
						
						
							
							Better formatting for spacy train CLI (#2357)
						
						
						
						
						* Better formatting for `spacy train` CLI
Changed to use fixed-spaces rather than tabs to align table headers and data.
### Before:
```
Itn.    P.Loss  N.Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %
0       4618.857        2910.004        76.172  79.645  67.987  88.732  88.261  100.000 4436.9  6376.4
1       4671.972        3764.812        74.481  78.046  62.374  82.680  88.377  100.000 4672.2  6227.1
2       4742.756        3673.473        71.994  77.380  63.966  84.494  90.620  100.000 4298.0  5983.9
```
### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```
* Added contributor file 
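
The switch from tabs to fixed-width columns described in this commit can be illustrated with a short sketch. It is not the code from #2357: `format_row` is a hypothetical helper, and the width rule (longest cell plus two spaces) is an assumption chosen to mirror the "After" output.

```python
def format_row(values, widths):
    """Pad each value with spaces to a fixed column width (no tabs)."""
    padded = "".join(str(value).ljust(width) for value, width in zip(values, widths))
    return padded.rstrip()


header = ["Itn.", "Dep Loss", "NER Loss", "UAS"]
rows = [
    [0, "4618.857", "2910.004", "76.172"],
    [1, "4671.972", "3764.812", "74.481"],
]
# Each column is as wide as its longest cell plus two spaces of padding.
widths = [max(len(str(row[i])) for row in [header] + rows) + 2 for i in range(len(header))]
table = [format_row(header, widths)] + [format_row(row, widths) for row in rows]
print("\n".join(table))
```

Because every cell is padded to the column width, a long value such as `4618.857` can no longer push the following columns out of line the way a tab stop does.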
						
					 
					
						2018-05-25 13:08:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c4a6d66fa 
							
						 
					 
					
						
						
							
							Merge master into develop. Big merge, many conflicts -- need to review  
						
						
						
					 
					
						2018-04-29 14:49:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3141e04822 
							
						 
					 
					
						
						
							
							💫 New system for error messages and warnings (#2163)
						
						
						
						
						* Add spacy.errors module
* Update deprecation and user warnings
* Replace errors and asserts with new error message system
* Remove redundant asserts
* Fix whitespace
* Add messages for print/util.prints statements
* Fix typo
* Fix typos
* Move CLI messages to spacy.cli._messages
* Add decorator to display error code with message
An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.
* Remove unused link in spacy.about
* Update errors for invalid pipeline components
* Improve error for unknown factories
* Add displaCy warnings
* Update formatting consistency
* Move error message to spacy.errors
* Update errors and check if doc returned by component is None 
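
The decorator idea mentioned in this commit message (the error code is added only when a message is retrieved from its containing class) could look roughly like the sketch below. It is a minimal illustration, not necessarily the implementation in `spacy.errors`: the decorator name `with_codes` and the sample message are hypothetical.

```python
def with_codes(err_cls):
    """Wrap a class of message strings so that each message is prefixed
    with its attribute name (the error code) at lookup time."""

    class WithCodes(object):
        def __getattr__(self, code):
            # Called only for names not found on the wrapper itself,
            # so the stored message strings are never modified.
            msg = getattr(err_cls, code)
            return "[{code}] {msg}".format(code=code, msg=msg)

    return WithCodes()


@with_codes
class Errors(object):
    # Hypothetical message text, used only for this example.
    E001 = "No component '{name}' found in pipeline."


message = Errors.E001.format(name="ner")
print(message)
```

The prefix is built on attribute access, so the raised exception carries an ordinary string and no traceback handling is involved.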
						
					 
					
						2018-04-03 15:50:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							17c3e7efa2 
							
						 
					 
					
						
						
							
							Add message noting vectors  
						
						
						
					 
					
						2018-03-28 16:33:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1f7229f40f 
							
						 
					 
					
						
						
							
							Revert "Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop"  
						
						
						
						
						This reverts commit c9ba3d3c2d92c26a35d4 
						
					 
					
						2018-03-27 19:23:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							86405e4ad1 
							
						 
					 
					
						
						
							
							Fix CLI for multitask objectives  
						
						
						
					 
					
						2018-02-18 10:59:11 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a34749b2bf 
							
						 
					 
					
						
						
							
							Add multitask objectives options to train CLI  
						
						
						
					 
					
						2018-02-17 22:03:54 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							262d0a3148 
							
						 
					 
					
						
						
							
							Fix overwriting of lexical attributes when loading vectors during training  
						
						
						
					 
					
						2018-02-17 18:11:11 +01:00 
						 
				 
			
				
					
						
							
							
								Johannes Dollinger 
							
						 
					 
					
						
						
						
						
							
						
						
							bf94c13382 
							
						 
					 
					
						
						
							
							Don't fix random seeds on import  
						
						
						
					 
					
						2018-02-13 12:42:23 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							7f0ab145e9 
							
						 
					 
					
						
						
							
							Don't pass CLI command name as dummy argument  
						
						
						
					 
					
						2018-01-04 21:33:47 +01:00 
						 
				 
			
				
					
						
							
							
								Søren Lind Kristiansen 
							
						 
					 
					
						
						
						
						
							
						
						
							a9ff6eadc9 
							
						 
					 
					
						
						
							
							Prefix dummy argument names with underscore  
						
						
						
					 
					
						2018-01-03 20:48:12 +01:00 
						 
				 
			
				
					
						
							
							
								Isaac Sijaranamual 
							
						 
					 
					
						
						
						
						
							
						
						
							20ae0c459a 
							
						 
					 
					
						
						
							
							Fixes "Error saving model"  #1622  
						
						
						
					 
					
						2017-12-10 23:07:13 +01:00 
						 
				 
			
				
					
						
							
							
								Isaac Sijaranamual 
							
						 
					 
					
						
						
						
						
							
						
						
							e188b61960 
							
						 
					 
					
						
						
							
							Make cli/train.py not eat exception  
						
						
						
					 
					
						2017-12-10 22:53:08 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c2bbf076a4 
							
						 
					 
					
						
						
							
							Add document length cap for training  
						
						
						
					 
					
						2017-11-03 01:54:54 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							37e62ab0e2 
							
						 
					 
					
						
						
							
							Update vector meta in meta.json  
						
						
						
					 
					
						2017-11-01 01:25:09 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3659a807b0 
							
						 
					 
					
						
						
							
							Remove vector pruning arg from train CLI  
						
						
						
					 
					
						2017-10-31 19:21:05 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e98451b5f7 
							
						 
					 
					
						
						
							
							Add -prune-vectors argument to spacy.cli.train
						
						
						
					 
					
						2017-10-30 18:00:10 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							d941fc3667 
							
						 
					 
					
						
						
							
							Tidy up CLI  
						
						
						
					 
					
						2017-10-27 14:38:39 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							11e3f19764 
							
						 
					 
					
						
						
							
							Fix vectors data added after training (see #1457)
						
						
						
					 
					
						2017-10-25 16:08:26 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							273e638183 
							
						 
					 
					
						
						
							
							Add vector data to model meta after training (see #1457)
						
						
						
					 
					
						2017-10-25 16:03:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a955843684 
							
						 
					 
					
						
						
							
							Increase default number of epochs  
						
						
						
					 
					
						2017-10-12 13:13:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							acba2e1051 
							
						 
					 
					
						
						
							
							Fix metadata in training  
						
						
						
					 
					
						2017-10-11 08:55:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							74c2c6a58c 
							
						 
					 
					
						
						
							
							Add default name and lang to meta  
						
						
						
					 
					
						2017-10-11 08:49:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5156074df1 
							
						 
					 
					
						
						
							
							Make loading code more consistent in train command  
						
						
						
					 
					
						2017-10-10 12:51:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							97c9b5db8b 
							
						 
					 
					
						
						
							
							Patch spacy.train for new pipeline management  
						
						
						
					 
					
						2017-10-09 23:41:16 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							808d8740d6 
							
						 
					 
					
						
						
							
							Remove print statement  
						
						
						
					 
					
						2017-10-09 08:45:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0f41b25f60 
							
						 
					 
					
						
						
							
							Add speed benchmarks to metadata  
						
						
						
					 
					
						2017-10-09 08:05:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							be4f0b6460 
							
						 
					 
					
						
						
							
							Update defaults  
						
						
						
					 
					
						2017-10-08 02:08:12 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9d66a915da 
							
						 
					 
					
						
						
							
							Update training defaults  
						
						
						
					 
					
						2017-10-07 21:02:38 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c6cd81f192 
							
						 
					 
					
						
						
							
							Wrap try/except around model saving  
						
						
						
					 
					
						2017-10-05 08:14:24 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5743b06e36 
							
						 
					 
					
						
						
							
							Wrap model saving in try/except  
						
						
						
					 
					
						2017-10-05 08:12:50 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8902df44de 
							
						 
					 
					
						
						
							
							Fix component disabling during training  
						
						
						
					 
					
						2017-10-02 21:07:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c617d288d8 
							
						 
					 
					
						
						
							
							Update pipeline component names in spaCy train  
						
						
						
					 
					
						2017-10-02 17:20:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ac8481a7b0 
							
						 
					 
					
						
						
							
							Print NER loss  
						
						
						
					 
					
						2017-09-28 08:05:31 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							542ebfa498 
							
						 
					 
					
						
						
							
							Improve defaults  
						
						
						
					 
					
						2017-09-27 18:54:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dcb86bdc43 
							
						 
					 
					
						
						
							
							Default batch size to 32  
						
						
						
					 
					
						2017-09-27 11:48:19 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							1ff62eaee7 
							
						 
					 
					
						
						
							
							Fix option shortcut to avoid conflict  
						
						
						
					 
					
						2017-09-26 17:59:34 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							7fdfb78141 
							
						 
					 
					
						
						
							
							Add version option to cli.train  
						
						
						
					 
					
						2017-09-26 17:34:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							698fc0d016 
							
						 
					 
					
						
						
							
							Remove merge artefact  
						
						
						
					 
					
						2017-09-26 08:31:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							defb68e94f 
							
						 
					 
					
						
						
							
							Update feature/noshare with recent develop changes  
						
						
						
					 
					
						2017-09-26 08:15:14 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							edf7e4881d 
							
						 
					 
					
						
						
							
							Add meta.json option to cli.train and add relevant properties  
						
						
						
						
						Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packageable.
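
The meta.json handling described in this commit message might be sketched as below. This is an assumption-laden illustration rather than the code in `cli.train`: the `update_meta` helper, the `"accuracy"` key, the placeholder values and the sample scores are all hypothetical.

```python
import json


def update_meta(meta, lang, pipeline, spacy_version, accuracy):
    """Return a copy of `meta` with the training properties replaced and
    placeholders added for any missing packaging fields."""
    updated = dict(meta)
    updated["lang"] = lang
    updated["pipeline"] = list(pipeline)
    updated["spacy_version"] = spacy_version
    updated["accuracy"] = dict(accuracy)
    # Placeholders are only filled in when the fields are absent.
    updated.setdefault("name", "model")
    updated.setdefault("version", "0.0.0")
    return updated


existing = {"name": "my_model", "lang": "xx"}
meta = update_meta(existing, "en", ["tagger", "parser", "ner"], ">=2.0.0", {"uas": 76.172})
meta_json = json.dumps(meta, indent=2, sort_keys=True)
print(meta_json)
```

An existing `name` is kept, while the missing `version` receives a placeholder so that the resulting directory has the fields a packaging step would expect.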
						
					 
					
						2017-09-25 19:00:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							204b58c864 
							
						 
					 
					
						
						
							
							Fix evaluation during training  
						
						
						
					 
					
						2017-09-24 05:01:03 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dc3a623d00 
							
						 
					 
					
						
						
							
							Remove unused update_shared argument  
						
						
						
					 
					
						2017-09-24 05:00:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4348c479fc 
							
						 
					 
					
						
						
							
							Merge pre-trained vectors and noshare patches  
						
						
						
					 
					
						2017-09-22 20:07:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e93d43a43a 
							
						 
					 
					
						
						
							
							Fix training with preset vectors  
						
						
						
					 
					
						2017-09-22 20:00:40 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a2357cce3f 
							
						 
					 
					
						
						
							
							Set random seed in train script  
						
						
						
					 
					
						2017-09-23 02:57:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0a9016cade 
							
						 
					 
					
						
						
							
							Fix serialization during training  
						
						
						
					 
					
						2017-09-21 13:06:45 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							20193371f5 
							
						 
					 
					
						
						
							
							Don't share CNN, to reduce complexities  
						
						
						
					 
					
						2017-09-21 14:59:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1d73dec8b1 
							
						 
					 
					
						
						
							
							Refactor train script  
						
						
						
					 
					
						2017-09-20 19:17:10 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a0c4b33d03 
							
						 
					 
					
						
						
							
							Support resuming a model during spacy train  
						
						
						
					 
					
						2017-09-18 18:04:47 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8496d76224 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2017-09-14 09:21:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							24ff6b0ad9 
							
						 
					 
					
						
						
							
							Fix parsing and tok2vec models  
						
						
						
					 
					
						2017-09-06 05:50:58 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e920885676 
							
						 
					 
					
						
						
							
							Fix pickle during train  
						
						
						
					 
					
						2017-09-02 12:46:01 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7a6edeea68 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2017-08-20 12:55:39 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f2f9229964 
							
						 
					 
					
						
						
							
							Fix name of update_shared flag  
						
						
						
					 
					
						2017-08-20 18:19:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							84bb543e4d 
							
						 
					 
					
						
						
							
							Add gold_preproc flag to cli/train  
						
						
						
					 
					
						2017-08-20 11:07:00 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							11c31d285c 
							
						 
					 
					
						
						
							
							Restore changes from nn-beam-parser  
						
						
						
					 
					
						2017-08-18 22:26:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							52c180ecf5 
							
						 
					 
					
						
						
							
							Revert "Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop"  
						
						
						
						
						This reverts commit ea8de11ad508e443e083 
						
					 
					
						2017-08-14 13:00:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8870d491f1 
							
						 
					 
					
						
						
							
							Remove redundant pickling during training  
						
						
						
					 
					
						2017-08-12 08:55:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0a566dc320 
							
						 
					 
					
						
						
							
							Add update_tensors flag to Language.update. Experimental, re  #1182  
						
						
						
					 
					
						2017-08-06 02:18:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c52fde40f4 
							
						 
					 
					
						
						
							
							Improve train CLI  
						
						
						
					 
					
						2017-06-04 20:18:37 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							21eef90dbc 
							
						 
					 
					
						
						
							
							Support specifying which GPU  
						
						
						
					 
					
						2017-06-03 16:10:23 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							43353b5413 
							
						 
					 
					
						
						
							
							Improve train CLI script
						
						
						
					 
					
						2017-06-03 13:28:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8a693c2605 
							
						 
					 
					
						
						
							
							Write binary file during training  
						
						
						
					 
					
						2017-05-31 02:59:18 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							49235017bf 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2017-05-27 16:34:28 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5e4312feed 
							
						 
					 
					
						
						
							
							Evaluate loaded class, to ensure save/load works  
						
						
						
					 
					
						2017-05-27 15:47:02 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							086a06e7d7 
							
						 
					 
					
						
						
							
							Fix CLI docstrings and add command as first argument  
						
						
						
						
						Workaround for Plac 
						
					 
					
						2017-05-27 20:01:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							de13fe0305 
							
						 
					 
					
						
						
							
							Remove length cap on sentences  
						
						
						
					 
					
						2017-05-27 08:20:32 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d65f99a720 
							
						 
					 
					
						
						
							
							Improve model saving in train script  
						
						
						
					 
					
						2017-05-26 05:52:09 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							df8015f05d 
							
						 
					 
					
						
						
							
							Tweaks to train script  
						
						
						
					 
					
						2017-05-25 17:15:24 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							702fe74a4d 
							
						 
					 
					
						
						
							
							Clean up spacy.cli.train  
						
						
						
					 
					
						2017-05-25 16:16:30 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							135a13790c 
							
						 
					 
					
						
						
							
							Disable gold preprocessing  
						
						
						
					 
					
						2017-05-24 20:10:20 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3959d778ac 
							
						 
					 
					
						
						
							
							Revert "Revert "WIP on improving parser efficiency""  
						
						
						
						
						This reverts commit 532afef4a8 
						
					 
					
						2017-05-23 03:06:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							532afef4a8 
							
						 
					 
					
						
						
							
							Revert "WIP on improving parser efficiency"  
						
						
						
						
						This reverts commit bdaac7ab44 
						
					 
					
						2017-05-23 03:05:25 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bdaac7ab44 
							
						 
					 
					
						
						
							
							WIP on improving parser efficiency  
						
						
						
					 
					
						2017-05-23 02:59:31 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6e8dce2c05 
							
						 
					 
					
						
						
							
							Fix train command line args  
						
						
						
					 
					
						2017-05-22 10:41:39 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ae8cf70dc1 
							
						 
					 
					
						
						
							
							Fix CLI train signature  
						
						
						
					 
					
						2017-05-22 06:13:39 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							fc3ec733ea 
							
						 
					 
					
						
						
							
							Reduce complexity in CLI  
						
						
						
						
						Remove now redundant model command and move plac annotations to cli files
						
					 
					
						2017-05-22 12:28:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bc2294d7f1 
							
						 
					 
					
						
						
							
							Add support for fiddly hyper-parameters to train func  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4e0988605a 
							
						 
					 
					
						
						
							
							Pass through non-projective=True  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e14533757b 
							
						 
					 
					
						
						
							
							Use averaged params for evaluation  
						
						
						
					 
					
						2017-05-22 04:51:08 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4c9202249d 
							
						 
					 
					
						
						
							
							Refactor training, to fix memory leak  
						
						
						
					 
					
						2017-05-21 09:07:06 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3376d4d6e8 
							
						 
					 
					
						
						
							
							Update the train script, fixing GPU memory leak  
						
						
						
					 
					
						2017-05-19 18:15:50 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ca70b08661 
							
						 
					 
					
						
						
							
							Fix GPU training and evaluation  
						
						
						
					 
					
						2017-05-18 08:30:33 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fc8d3a112c 
							
						 
					 
					
						
						
							
							Add util.env_opt support: Can set hyper params through environment variables.  
						
						
						
					 
					
						2017-05-18 04:36:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							793430aa7a 
							
						 
					 
					
						
						
							
							Get spaCy train command working with neural network  
						
						
						
						
						* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab 
						
					 
					
						2017-05-17 12:04:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8cf097ca88 
							
						 
					 
					
						
						
							
							Redesign training to integrate NN components  
						
						
						
						
						* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information more flexibly.
						
					 
					
						2017-05-16 16:17:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5211645af3 
							
						 
					 
					
						
						
							
							Get data flowing through pipeline. Needs redesign  
						
						
						
					 
					
						2017-05-16 11:21:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a9edb3aa1d 
							
						 
					 
					
						
						
							
							Improve integration of NN parser, to support unified training API  
						
						
						
					 
					
						2017-05-15 21:53:27 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							59c3b9d4dd 
							
						 
					 
					
						
						
							
							Tidy up CLI and fix print functions  
						
						
						
					 
					
						2017-05-07 23:25:29 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4f9657b42b 
							
						 
					 
					
						
						
							
							Fix reporting if no dev data with train  
						
						
						
					 
					
						2017-04-23 22:27:10 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							3a9710f356 
							
						 
					 
					
						
						
							
							Pass dev_scores to print_progress correctly (resolves #1008)
						
						
						
						
						Only read scores attribute if command is used with dev_data, otherwise default dev_scores to empty dict.
						
					 
					
						2017-04-23 15:58:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							89a4f262fc 
							
						 
					 
					
						
						
							
							Fix training methods  
						
						
						
					 
					
						2017-04-16 13:00:37 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							d24589aa72 
							
						 
					 
					
						
						
							
							Clean up imports, unused code, whitespace, docstrings  
						
						
						
					 
					
						2017-04-15 12:05:47 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							9952d3b08a 
							
						 
					 
					
						
						
							
							Fix whitespace  
						
						
						
					 
					
						2017-04-07 13:02:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2efdbc08ff 
							
						 
					 
					
						
						
							
							Make training work with directories  
						
						
						
					 
					
						2017-03-26 08:46:44 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9dcb58aaaf 
							
						 
					 
					
						
						
							
							Merge CLI changes  
						
						
						
					 
					
						2017-03-26 07:30:45 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6b7f7a2060 
							
						 
					 
					
						
						
							
							Connect parser L1 option to train CLI  
						
						
						
					 
					
						2017-03-26 07:24:07 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dec5571bf3 
							
						 
					 
					
						
						
							
							Update train CLI  
						
						
						
					 
					
						2017-03-26 07:16:52 -05:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							53cf2f1c0e 
							
						 
					 
					
						
						
							
							Make dev data optional  
						
						
						
					 
					
						2017-03-26 11:48:17 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							0035fd9efe 
							
						 
					 
					
						
						
							
							Add spacy train work in progress  
						
						
						
					 
					
						2017-03-23 11:08:41 +01:00