Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							353f8486f5 
							
						 
					 
					
						
						
							
							Merge branch 'master' into spacy.io  
						
						
						
					 
					
						2020-03-12 14:45:33 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1d6aec805d 
							
						 
					 
					
						
						
							
							Fix formatting and update docs for v2.2.4  
						
						
						
					 
					
						2020-03-09 11:17:20 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5f68004264 
							
						 
					 
					
						
						
							
							Port over gitignore changes from develop  
						
						
						
						
						Prevents stale files when switching branches 
						
					 
					
						2020-03-09 11:05:00 +01:00 
						 
				 
			
				
					
						
							
							
								Mark Abraham 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0345135167 
							
						 
					 
					
						
						
							
							Tokenizer to_disk and from_disk now ensure paths ( #5116 )  
						
						
						
						
						* Tokenizer to_disk and from_disk now ensure strings are converted to paths
Fixes  #5115 
* Sign contributor agreement 
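
The change described above comes down to normalizing string arguments into path objects before any file I/O happens. A minimal sketch of that idea, using only the standard library; `ensure_path` below is a hypothetical helper written for illustration and is not claimed to be the function the tokenizer actually calls:

```python
from pathlib import Path


def ensure_path(path):
    """Return a Path for str input; pass any other object through unchanged."""
    # Hypothetical helper for illustration only.
    if isinstance(path, str):
        return Path(path)
    return path


# Both call styles end up with an object that supports Path operations.
from_string = ensure_path("models/tokenizer")
from_path = ensure_path(Path("models/tokenizer"))
```

With a guard like this at the top of a `to_disk` or `from_disk` style method, callers can pass either a plain string or a `Path` without hitting attribute errors later on.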
						
					 
					
						2020-03-08 13:25:56 +01:00 
						 
				 
			
				
					
						
							
							
								Yohei Tamura 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							31755630a7 
							
						 
					 
					
						
						
							
							fix typ ( #5106 )  
						
						
						
					 
					
						2020-03-08 13:24:38 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9dd98a4b27 
							
						 
					 
					
						
						
							
							Improve Makefile ( #5105 )  
						
						
						
						
						* Explicitly upgrade pip
* Include spacy-lookups-data in pex 
						
					 
					
						2020-03-08 13:24:19 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							993758c58f 
							
						 
					 
					
						
						
							
							Remove unnecessary iterator in Language.pipe ( #5101 )  
						
						
						
						
						Remove the iterator over `raw_texts` created with `itertools.tee()` in
`Language.pipe`. It was never consumed, so it only used memory
unnecessarily. 
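
Why an unconsumed `tee` branch costs memory can be seen in a small standalone sketch. This is not the code from `Language.pipe`; the `texts` generator and the variable names are assumptions made up for the illustration:

```python
import itertools


def texts():
    # Stand-in for a stream of raw texts.
    for i in range(5):
        yield "text %d" % i


used, unused = itertools.tee(texts())

# Consume only one branch, as a pipeline that never reads the other would.
consumed = list(used)

# The source generator is exhausted, yet the untouched branch can still
# produce every item: tee had to keep them all in its internal buffer.
buffered = list(unused)
```

Because the lagging branch must be able to replay everything the other branch has already seen, the buffer grows with the input for as long as that branch is alive and unread.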
						
					 
					
						2020-03-08 13:22:25 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cd79c7bd26 
							
						 
					 
					
						
						
							
							Merge pull request  #5110  from dhpollack/dhp/fix-minor-svg-error  
						
						
						
						
						fix typo in svg file - caused documentation build error 
						
					 
					
						2020-03-06 15:32:43 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1a2b8fc264 
							
						 
					 
					
						
						
							
							set vector of merged entity ( #5085 )  
						
						
						
						
						* merge_entities sets the vector in the vocab for the merged token
* add unit test
* import unicode_literals
* move code to _merge function
* only set vector if vocab has non-zero vectors 
						
					 
					
						2020-03-06 14:45:28 +01:00 
						 
				 
			
				
					
						
							
							
								David Pollack 
							
						 
					 
					
						
						
						
						
							
						
						
							80004930ed 
							
						 
					 
					
						
						
							
							fix typo in svg file  
						
						
						
					 
					
						2020-03-05 17:04:33 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3440a72ecb 
							
						 
					 
					
						
						
							
							Update Makefile ( #5099 )  
						
						
						
					 
					
						2020-03-04 19:28:16 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							31faab3647 
							
						 
					 
					
						
						
							
							Merge pull request  #5097  from mirfan899/master  
						
						
						
						
						Basque language support added. 
						
					 
					
						2020-03-04 17:20:23 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							99d8ee506f 
							
						 
					 
					
						
						
							
							Merge pull request  #5100  from adrianeboyd/feature/bump-srsly-1.0.2  
						
						
						
						
						Require srsly >=1.0.2 
						
					 
					
						2020-03-04 16:32:52 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							4d655b1d45 
							
						 
					 
					
						
						
							
							Require srsly >=1.0.2  
						
						
						
					 
					
						2020-03-04 13:50:37 +01:00 
						 
				 
			
				
					
						
							
							
								Muhammad Irfan 
							
						 
					 
					
						
						
						
						
							
						
						
							224a7f8e94 
							
						 
					 
					
						
						
							
							examples  
						
						
						
					 
					
						2020-03-04 15:49:06 +05:00 
						 
				 
			
				
					
						
							
							
								Muhammad Irfan 
							
						 
					 
					
						
						
						
						
							
						
						
							03376c9d9b 
							
						 
					 
					
						
						
							
							Basque language added and tested.  
						
						
						
					 
					
						2020-03-04 11:58:56 +05:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9be90dbca3 
							
						 
					 
					
						
						
							
							Improve token head verification ( #5079 )  
						
						
						
						
						* Improve token head verification
Improve the verification for valid token heads when heads are set:
* in `Token.head`: heads come from the same document
* in `Doc.from_array()`: head indices are within the bounds of the document
* Improve error message 
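
The bounds check described above can be sketched in plain Python. This is an illustration only: `check_heads` is a hypothetical function, and it assumes heads are given as offsets relative to each token's own position, which may not match how the library stores them internally:

```python
def check_heads(rel_heads):
    """Raise ValueError if any relative head offset points outside the doc."""
    # Hypothetical helper; rel_heads[i] is the head position minus i.
    n = len(rel_heads)
    for i, offset in enumerate(rel_heads):
        head = i + offset
        if not 0 <= head < n:
            raise ValueError(
                "token %d has head index %d, outside [0, %d)" % (i, head, n)
            )
    return True
```

Reporting the token position, the offending index, and the valid range in the message is one way to make the error easier to act on.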
						
					 
					
						2020-03-03 21:44:51 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8c20dae6f7 
							
						 
					 
					
						
						
							
							Fix model-final/model-best meta from train CLI ( #5093 )  
						
						
						
						
						* Fix model-final/model-best meta
* include speed and accuracy from final iteration
* combine with speeds from base model if necessary
* Include token_acc metric for all components 
						
					 
					
						2020-03-03 21:43:25 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a0998868ff 
							
						 
					 
					
						
						
							
							prevent updating cfg if the Model was already defined ( #5078 )  
						
						
						
					 
					
						2020-03-03 13:58:56 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d307e9ca58 
							
						 
					 
					
						
						
							
							take care of global vectors in multiprocessing ( #5081 )  
						
						
						
						
						* restore load_nlp.VECTORS in the child process
* add unit test
* fix test
* remove unnecessary import
* add utf8 encoding
* import unicode_literals 
						
					 
					
						2020-03-03 13:58:22 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d078b47c81 
							
						 
					 
					
						
						
							
							Break out of infinite loop as intended ( #5077 )  
						
						
						
					 
					
						2020-03-03 12:29:05 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							697bec764d 
							
						 
					 
					
						
						
							
							Normalize IS_SENT_START to SENT_START for Matcher ( #5080 )  
						
						
						
					 
					
						2020-03-03 12:22:39 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2281c4708c 
							
						 
					 
					
						
						
							
							Restore empty tokenizer properties ( #5026 )  
						
						
						
						
						* Restore empty tokenizer properties
* Check for types in tokenizer.from_bytes()
* Add test for setting empty tokenizer rules 
						
					 
					
						2020-03-02 11:55:02 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c6b12ab02a 
							
						 
					 
					
						
						
							
							Bugfix/get doc ( #5049 )  
						
						
						
						
						* new (broken) unit test
* fixing get_doc method 
						
					 
					
						2020-03-02 11:49:28 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							65d7bab10f 
							
						 
					 
					
						
						
							
							Initialize all values in a2b/b2a in new align ( #5063 )  
						
						
						
					 
					
						2020-02-27 18:43:00 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b4e0d2bf50 
							
						 
					 
					
						
						
							
							Improve Makefile ( #5067 )  
						
						
						
						
						* Improve pex making
* Update gitignore 
						
					 
					
						2020-02-26 20:59:10 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1c212215cd 
							
						 
					 
					
						
						
							
							Merge pull request  #5064  from adrianeboyd/feature/german-tokenization  
						
						
						
						
						Improve German tokenization 
						
					 
					
						2020-02-26 13:41:44 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							56978f5cd8 
							
						 
					 
					
						
						
							
							Merge pull request  #5060  from svlandeg/feature/update-thinc  
						
						
						
						
						update thinc 
						
					 
					
						2020-02-26 13:40:23 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							d1f703d78d 
							
						 
					 
					
						
						
							
							Improve German tokenization  
						
						
						
						
						Improve German tokenization with respect to Tiger. 
						
					 
					
						2020-02-26 13:06:52 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							54da6a2a07 
							
						 
					 
					
						
						
							
							Update pyproject.toml  
						
						
						
					 
					
						2020-02-26 12:51:53 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ed9358420e 
							
						 
					 
					
						
						
							
							Merge branch 'master' into pr/5060  
						
						
						
					 
					
						2020-02-26 12:51:29 +01:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ff184b7a9c 
							
						 
					 
					
						
						
							
							Add tag_map argument to CLI debug-data and train ( #4750 ) ( #5038 )  
						
						
						
						
						Add an argument for a path to a JSON-formatted tag map, which is used to
update and extend the default language tag map. 
						
					 
					
						2020-02-26 12:10:38 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							18ff97589d 
							
						 
					 
					
						
						
							
							update spacy to 2.2.4.dev0  
						
						
						
					 
					
						2020-02-26 10:50:05 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							62406a9513 
							
						 
					 
					
						
						
							
							update from thinc 7.4.0.dev2 to 7.4.0  
						
						
						
					 
					
						2020-02-26 10:30:35 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c7e3c034d2 
							
						 
					 
					
						
						
							
							Merge pull request  #5061  from explosion/fix/pyproject-toml-master  
						
						
						
						
						Update pyproject.toml 
						
					 
					
						2020-02-25 20:22:26 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dc36ec98a4 
							
						 
					 
					
						
						
							
							Update pyproject.toml  
						
						
						
					 
					
						2020-02-25 16:46:14 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							acb4e3c7ba 
							
						 
					 
					
						
						
							
							Merge pull request  #5039  from adrianeboyd/typo/website-token-api-shape  
						
						
						
						
						Fix formatting in Token API 
						
					 
					
						2020-02-25 14:57:25 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d50152b917 
							
						 
					 
					
						
						
							
							Merge pull request  #5019  from questoph/master  
						
						
						
						
						Optimizing tokenization for Luxembourgish (dealing with apostrophe infixes) 
						
					 
					
						2020-02-25 14:48:50 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4440a072d2 
							
						 
					 
					
						
						
							
							Merge pull request  #5006  from svlandeg/bugfix/multiproc-underscore  
						
						
						
						
						load Underscore state when multiprocessing 
						
					 
					
						2020-02-25 14:46:02 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							38fc05986c 
							
						 
					 
					
						
						
							
							Merge pull request  #5058  from bryant1410/patch-1  
						
						
						
						
						Add missing comma in a dependency specification 
						
					 
					
						2020-02-25 14:44:29 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							d848a68340 
							
						 
					 
					
						
						
							
							thinc 7.4.0.dev2  
						
						
						
					 
					
						2020-02-25 12:07:42 +01:00 
						 
				 
			
				
					
						
							
							
								Santiago Castro 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							54d8665ff7 
							
						 
					 
					
						
						
							
							Add missing comma in a dependency specification  
						
						
						
						
						Conda is complaining that it can't parse that line otherwise. 
						
					 
					
						2020-02-24 16:15:28 -05:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							b49a3afd0c 
							
						 
					 
					
						
						
							
							use clean_underscore fixture  
						
						
						
					 
					
						2020-02-23 15:49:20 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d6c0746347 
							
						 
					 
					
						
						
							
							Merge branch 'master' into spacy.io  
						
						
						
					 
					
						2020-02-23 13:57:01 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4890db6339 
							
						 
					 
					
						
						
							
							Auto-format and fix image [ci skip]  
						
						
						
					 
					
						2020-02-23 13:56:50 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							89967f3701 
							
						 
					 
					
						
						
							
							Merge branch 'master' into spacy.io  
						
						
						
					 
					
						2020-02-23 12:04:20 +01:00 
						 
				 
			
				
					
						
							
							
								Tom Keefe 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ddf63b97a8 
							
						 
					 
					
						
						
							
							make idx available via to_array ( #5030 )  
						
						
						
					 
					
						2020-02-22 14:13:06 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							44f4142ce4 
							
						 
					 
					
						
						
							
							add two abbreviations and some additional unit tests ( #5040 )  
						
						
						
					 
					
						2020-02-22 14:12:32 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							479bd8d09f 
							
						 
					 
					
						
						
							
							add lemma option to displacy 'dep' visualiser ( #5041 )  
						
						
						
						
						* add lemma option to displacy 'dep' visualiser
* more compact list comprehension
* add option to doc
* fix test and add lemmas to util.get_doc
* fix capital
* remove lemma from get_doc
* cleanup 
						
					 
					
						2020-02-22 14:11:51 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							3853d385fa 
							
						 
					 
					
						
						
							
							Fix formatting in Token API  
						
						
						
					 
					
						2020-02-20 13:41:24 +01:00