svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							2cafba5f50 
							
						 
					 
					
						
						
							
							shorten error message for clarity  
						
						
						
					 
					
						2020-10-09 12:17:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4771a10503 
							
						 
					 
					
						
						
							
							Make test more explicit [ci skip]  
						
						
						
					 
					
						2020-10-09 12:15:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							cc3646b06c 
							
						 
					 
					
						
						
							
							Add xfailing test for peculiar spans failure [ci skip]  
						
						
						
					 
					
						2020-10-09 12:10:25 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							8316bc7d4a 
							
						 
					 
					
						
						
							
							bugfix DisabledPipes  
						
						
						
					 
					
						2020-10-09 12:06:20 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							18dfb27985 
							
						 
					 
					
						
						
							
							Add custom error when evaluation throws a KeyError  
						
						
						
					 
					
						2020-10-09 12:05:33 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							39aabf50ab 
							
						 
					 
					
						
						
							
							Also rename to include_static_vectors in CharEmbed  
						
						
						
					 
					
						2020-10-09 11:54:48 +02:00 
						 
				 
			
				
					
						
							
							
								Florijan Stamenković 
							
						 
					 
					
						
						
						
						
							
						
						
							18f5c309dc 
							
						 
					 
					
						
						
							
							Fix Issue 6207 (#6208)
						
						
						
						
						* Regression test for issue 6207
* Fix issue 6207
* Sign contributor agreement
* Minor adjustments to test
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-10-09 10:14:40 +02:00 
						 
				 
			
				
					
						
							
							
								Duygu Altinok 
							
						 
					 
					
						
						
						
						
							
						
						
							80fb1bffc9 
							
						 
					 
					
						
						
							
							Ordinal numbers for Turkish (#6142)
						
						
						
						
						* minor ordinal number addition
* fixed typo
* added corresponding lexical test 
						
					 
					
						2020-10-09 10:13:15 +02:00 
						 
				 
			
				
					
						
							
							
								Duygu Altinok 
							
						 
					 
					
						
						
						
						
							
						
						
							2fad279a44 
							
						 
					 
					
						
						
							
							Turkish language syntax iterators (#6191)
						
						
						
						
						* added tr_vocab to config
* basic test
* added syntax iterator to Turkish lang class
* first version for Turkish syntax iter, without flat
* added simple tests with nmod, amod, det
* more tests to amod and nmod
* separated noun chunks and parser test
* rearrangement after nchunk parser separation
* added recursive NPs
* tests with complicated recursive NPs
* tests with conjed NPs
* additional tests for conj NP
* small modification for shaving off conj from NP
* added tests with flat
* more tests with flat
* added examples with flats conjed
* added inner func for flat trick
* corrected parse
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-10-09 10:10:22 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d093d6343b 
							
						 
					 
					
						
						
							
							TrainablePipe (#6213)
						
						
						
						
						* rename Pipe to TrainablePipe
* split functionality between Pipe and TrainablePipe
* remove unnecessary methods from certain components
* cleanup
* hasattr(component, "pipe") should be sufficient again
* remove serialization and vocab/cfg from Pipe
* unify _ensure_examples and validate_examples
* small fixes
* hasattr checks for self.cfg and self.vocab
* make is_resizable and is_trainable properties
* serialize strings.json instead of vocab
* fix KB IO + tests
* fix typos
* more typos
* _added_strings as a set
* few more tests specifically for _added_strings field
* bump to 3.0.0a36 
						
					 
					
						2020-10-08 21:33:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8ff73f04db 
							
						 
					 
					
						
						
							
							Fix morph in Doc.to_json  
						
						
						
					 
					
						2020-10-08 14:44:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							064575d79d 
							
						 
					 
					
						
						
							
							Merge pull request #6216 from svlandeg/feature/nel-initialize
						
						
						
					 
					
						2020-10-08 11:14:12 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							3e2e1fd323 
							
						 
					 
					
						
						
							
							cleanup  
						
						
						
					 
					
						2020-10-08 10:37:32 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							eaf5c265cb 
							
						 
					 
					
						
						
							
							set_kb method for entity_linker  
						
						
						
					 
					
						2020-10-08 10:34:01 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							010956d493 
							
						 
					 
					
						
						
							
							Clear rule-based components on initialize  
						
						
						
					 
					
						2020-10-08 09:51:31 +02:00 
						 
				 
			
				
					
						
							
							
								Baranitharan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d6037c1860 
							
						 
					 
					
						
						
							
							added sentence  
						
						
						
					 
					
						2020-10-08 08:22:58 +05:30 
						 
				 
			
				
					
						
							
							
								Baranitharan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							81afe9b19d 
							
						 
					 
					
						
						
							
							Update examples.py  
						
						
						
					 
					
						2020-10-08 08:17:25 +05:30 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							241cd112f5 
							
						 
					 
					
						
						
							
							add reenabled pipe names back to the meta before serializing (#6219)
						
						
						
					 
					
						2020-10-08 00:44:16 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2998131416 
							
						 
					 
					
						
						
							
							Reproducibility for TextCat and Tok2Vec (#6218)
						
						
						
						
						* ensure fixed seed in HashEmbed layers
* forgot about the joys of python 2 
						
					 
					
						2020-10-08 00:43:46 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							efedccea8d 
							
						 
					 
					
						
						
							
							fix tests  
						
						
						
					 
					
						2020-10-07 15:29:52 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6b8bdb2d39 
							
						 
					 
					
						
						
							
							add init_config to nlp.create_pipe  
						
						
						
					 
					
						2020-10-07 14:58:16 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							33c2d4af16 
							
						 
					 
					
						
						
							
							move kb_loader to initialize for NEL instead of constructor  
						
						
						
					 
					
						2020-10-07 14:56:00 +02:00 
						 
				 
			
				
					
						
							
							
								Wannaphong Phatthiyaphaibun 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9fc8392b38 
							
						 
					 
					
						
						
							
							Add Thai tag map (LST20 Corpus) (#6163)
						
						
						
						
						* Add Thai tag map (LST20 Corpus)
By @korakot
* Update tag_map.py
* Update tag_map.py
* Update tag_map.py 
						
					 
					
						2020-10-07 11:12:01 +02:00 
						 
				 
			
				
					
						
							
							
								Duygu Altinok 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7e821c2776 
							
						 
					 
					
						
						
							
							Turkish language syntax iterators (#6191)
						
						
						
						
						* added tr_vocab to config
* basic test
* added syntax iterator to Turkish lang class
* first version for Turkish syntax iter, without flat
* added simple tests with nmod, amod, det
* more tests to amod and nmod
* separated noun chunks and parser test
* rearrangement after nchunk parser separation
* added recursive NPs
* tests with complicated recursive NPs
* tests with conjed NPs
* additional tests for conj NP
* small modification for shaving off conj from NP
* added tests with flat
* more tests with flat
* added examples with flats conjed
* added inner func for flat trick
* corrected parse
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-10-07 11:07:52 +02:00 
						 
				 
			
				
					
						
							
							
								Duygu Altinok 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2ce6fc2611 
							
						 
					 
					
						
						
							
							Turkish tag map and morph rules addition (#6141)
						
						
						
						
						* feat: added turkish tag map
* feat: morph rules cconj and sconj
* feat: more conjuncts
* feat: added popular postpositions
* feat: added adverbs
* feat: added personal pronouns
* feat: added reflexive pronouns
* minor: corrected case capital
* minor: fixed comma typo
* feat: added indef pronouns
* feat: added dict iter
* fixed comma typo
* updated language class with tag map and morph
* use default tag map instead
* removed tag map 
						
					 
					
						2020-10-07 10:27:36 +02:00 
						 
				 
			
				
					
						
							
							
								Duygu Altinok 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b95a11dd95 
							
						 
					 
					
						
						
							
							Ordinal numbers for Turkish (#6142)
						
						
						
						
						* minor ordinal number addition
* fixed typo
* added corresponding lexical test 
						
					 
					
						2020-10-07 10:25:37 +02:00 
						 
				 
			
				
					
						
							
							
								Rahul Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1a00bff06d 
							
						 
					 
					
						
						
							
							Hindi: Adds tests for lexical attributes (norm and like_num) (#5829)
						
						
						
						
						* Hindi: Adds tests for lexical attributes (norm and like_num)
* Signs and adds the contributor agreement
* Add ordinal numbers to be tagged as like_num
* Adds alternate pronunciation for 31 and 39 
						
					 
					
						2020-10-07 10:23:32 +02:00 
						 
				 
			
				
					
						
							
							
								Nuccy90 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c809b2c8e7 
							
						 
					 
					
						
						
							
							Update morph_rules.py (#6102)
						
						
						
						
						* Update morph_rules.py
Added "dig" and "dej" ("you" in accusative form)
* Create Nuccy90.md
* Update Nuccy90.md 
						
					 
					
						2020-10-06 15:14:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1a500f9717 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a35  
						
						
						
					 
					
						2020-10-06 14:19:07 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fff3f8ccfa 
							
						 
					 
					
						
						
							
							Fix packaging pin (#6212)
						
						
						
						
						* pin packaging to >=20.0
* ignore spacy-pkuseg in requirements unit test 
						
					 
					
						2020-10-06 14:16:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cfb9770a94 
							
						 
					 
					
						
						
							
							Fix empty input into StaticVectors layer (#6211)
						
						
						
						
						* Add test for empty doc(s)
* Fix empty check in staticvectors
* Remove xfail
* Update spacy/ml/staticvectors.py 
						
					 
					
						2020-10-06 14:15:41 +02:00 
						 
				 
			
				
					
						
							
							
								Florijan Stamenković 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9db670b996 
							
						 
					 
					
						
						
							
							Fix Issue 6207 (#6208)
						
						
						
						
						* Regression test for issue 6207
* Fix issue 6207
* Sign contributor agreement
* Minor adjustments to test
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-10-06 11:17:37 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							568e12215d 
							
						 
					 
					
						
						
							
							Merge pull request #6206 from svlandeg/fix/patterns-init
						
						
						
					 
					
						2020-10-06 10:27:23 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							9b4cf7b0b6 
							
						 
					 
					
						
						
							
							update output of debug config command  
						
						
						
					 
					
						2020-10-06 09:47:23 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							ff9ac39c88 
							
						 
					 
					
						
						
							
							read entity_ruler patterns with srsly.read_jsonl.v1  
						
						
						
					 
					
						2020-10-05 22:50:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							126268ce50 
							
						 
					 
					
						
						
							
							Auto-format [ci skip]  
						
						
						
					 
					
						2020-10-05 21:58:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1a554bdcb1 
							
						 
					 
					
						
						
							
							Update docs and docstring [ci skip]  
						
						
						
					 
					
						2020-10-05 21:55:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9614e53b02 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-10-05 21:55:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							181039bd17 
							
						 
					 
					
						
						
							
							Merge pull request #6205 from explosion/feature/embed-features
						
						
						
					 
					
						2020-10-05 21:49:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5ba418b08c 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-10-05 21:44:01 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							568617af58 
							
						 
					 
					
						
						
							
							Merge pull request #6202 from explosion/feature/project-spacy-version
						
						
						
					 
					
						2020-10-05 21:40:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2d0c0134bc 
							
						 
					 
					
						
						
							
							Adjust message [ci skip]  
						
						
						
					 
					
						2020-10-05 21:38:23 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6abfc2911d 
							
						 
					 
					
						
						
							
							Merge pull request #6203 from adrianeboyd/feature/zh-spacy-pkuseg
						
						
						
					 
					
						2020-10-05 21:35:57 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b7e01d2024 
							
						 
					 
					
						
						
							
							Fix quickstart  
						
						
						
					 
					
						2020-10-05 21:21:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ff8b980775 
							
						 
					 
					
						
						
							
							Upd quickstart template  
						
						
						
					 
					
						2020-10-05 21:19:41 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							91d0fbb588 
							
						 
					 
					
						
						
							
							Fix test  
						
						
						
					 
					
						2020-10-05 21:13:53 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9ca283a899 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/project-spacy-version  
						
						
						
					 
					
						2020-10-05 21:06:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0135f6ed95 
							
						 
					 
					
						
						
							
							Enable commit check via env var  
						
						
						
					 
					
						2020-10-05 20:51:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b392d48e76 
							
						 
					 
					
						
						
							
							Fix test  
						
						
						
					 
					
						2020-10-05 20:17:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							be99f1e4de 
							
						 
					 
					
						
						
							
							Remove output dirs before training (#6204)
						
						
						
						
						* Remove output dirs before training
* Re-raise error if cleaning fails 
						
					 
					
						2020-10-05 20:11:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e50047f1c5 
							
						 
					 
					
						
						
							
							Check lengths match  
						
						
						
					 
					
						2020-10-05 20:02:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							582701519e 
							
						 
					 
					
						
						
							
							Remove __release__ flag  
						
						
						
					 
					
						2020-10-05 20:00:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d58fb42707 
							
						 
					 
					
						
						
							
							Add spacy_version option and validation for project.yml  
						
						
						
					 
					
						2020-10-05 20:00:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							db84d175c3 
							
						 
					 
					
						
						
							
							Fix test  
						
						
						
					 
					
						2020-10-05 19:59:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cdd2b79b6d 
							
						 
					 
					
						
						
							
							Remove deprecated MultiHashEmbed  
						
						
						
					 
					
						2020-10-05 19:58:18 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6dcc4a0ba6 
							
						 
					 
					
						
						
							
							Simplify MultiHashEmbed signature  
						
						
						
					 
					
						2020-10-05 19:57:45 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							193e0d5a98 
							
						 
					 
					
						
						
							
							add docs for entity_ruler.initialize  
						
						
						
					 
					
						2020-10-05 18:04:08 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							3ac3447eee 
							
						 
					 
					
						
						
							
							cleanup  
						
						
						
					 
					
						2020-10-05 17:50:37 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							9eb813a35d 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/patterns-init  
						
						
						
					 
					
						2020-10-05 17:49:44 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							f102ef6b54 
							
						 
					 
					
						
						
							
							Read features.msgpack instead of features.pkl  
						
						
						
					 
					
						2020-10-05 17:47:39 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							4e3ace4b8c 
							
						 
					 
					
						
						
							
							is_trainable method  
						
						
						
					 
					
						2020-10-05 17:43:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							84fedcebab 
							
						 
					 
					
						
						
							
							Make args keyword-only [ci skip]  
						
						
						
						
						Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-10-05 17:07:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							71e73ed0a6 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/embed-features  
						
						
						
					 
					
						2020-10-05 17:00:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3ee3649b52 
							
						 
					 
					
						
						
							
							Fix augment  
						
						
						
					 
					
						2020-10-05 16:59:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							22937d25a9 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/embed-features  
						
						
						
					 
					
						2020-10-05 16:42:17 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8deed614e9 
							
						 
					 
					
						
						
							
							Fix augment  
						
						
						
					 
					
						2020-10-05 16:41:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4ed3e037df 
							
						 
					 
					
						
						
							
							Fix augment  
						
						
						
					 
					
						2020-10-05 16:40:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9f1bc3f24c 
							
						 
					 
					
						
						
							
							Fix augment  
						
						
						
					 
					
						2020-10-05 16:40:23 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							dc06912c76 
							
						 
					 
					
						
						
							
							prevent loss keyerror for non-trainable components  
						
						
						
					 
					
						2020-10-05 16:33:28 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							187234648c 
							
						 
					 
					
						
						
							
							Revert back to "default" as default for pkuseg_user_dict  
						
						
						
					 
					
						2020-10-05 16:24:28 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							65abd77779 
							
						 
					 
					
						
						
							
							add finish_update to Pipe  
						
						
						
					 
					
						2020-10-05 16:23:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							90040aacec 
							
						 
					 
					
						
						
							
							Fix merge  
						
						
						
					 
					
						2020-10-05 16:12:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							93a98e8c3e 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/embed-features  
						
						
						
					 
					
						2020-10-05 15:51:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							eb9ba61517 
							
						 
					 
					
						
						
							
							Format  
						
						
						
					 
					
						2020-10-05 15:29:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7d93575f35 
							
						 
					 
					
						
						
							
							spacy/tests/  
						
						
						
					 
					
						2020-10-05 15:28:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f4ca9a39cb 
							
						 
					 
					
						
						
							
							spacy/tests/  
						
						
						
					 
					
						2020-10-05 15:27:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f2f1deca66 
							
						 
					 
					
						
						
							
							spacy/tests/  
						
						
						
					 
					
						2020-10-05 15:24:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8ec79ad3fa 
							
						 
					 
					
						
						
							
							Allow configuration of MultiHashEmbed features  
						
						
						
						
						Update arguments to MultiHashEmbed layer so that the attributes can be
controlled. A kind of tricky scheme is used to allow optional
specification of the rows. I think it's an okay balance between
flexibility and convenience. 
						
					 
					
						2020-10-05 15:22:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7946fd84bb 
							
						 
					 
					
						
						
							
							Merge pull request #6200 from adrianeboyd/bugfix/vocab-disk-lookups-vectors
						
						
						
						
						Always serialize lookups and vectors to disk 
						
					 
					
						2020-10-05 15:15:25 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8171e28b20 
							
						 
					 
					
						
						
							
							Remove logging [ci skip]  
						
						
						
						
						This would be fired on each example, which is wrong 
						
					 
					
						2020-10-05 15:09:52 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							251b3eb4e5 
							
						 
					 
					
						
						
							
							add initialize method for entity_ruler  
						
						
						
					 
					
						2020-10-05 14:59:13 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f4f49f5877 
							
						 
					 
					
						
						
							
							update blis ( #6198 )  
						
						
						
						
						* allow higher blis version
* fix typo
* bump to 3.0.0a34
* fix pins in other files 
						
					 
					
						2020-10-05 14:58:56 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							5d19dfc9d3 
							
						 
					 
					
						
						
							
							Update Chinese tokenizer for spacy-pkuseg fork  
						
						
						
					 
					
						2020-10-05 14:21:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6a9d14e35a 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-10-05 14:17:41 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d2b9aafb8c 
							
						 
					 
					
						
						
							
							Fix augmenter  
						
						
						
					 
					
						2020-10-05 14:14:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6260fa3c10 
							
						 
					 
					
						
						
							
							Merge pull request  #6201  from svlandeg/fix/error_nr  
						
						
						
					 
					
						2020-10-05 14:00:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6958510bda 
							
						 
					 
					
						
						
							
							Include spaCy version check in project CLI  
						
						
						
					 
					
						2020-10-05 13:53:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							20f2a17a09 
							
						 
					 
					
						
						
							
							Merge test_misc and test_util  
						
						
						
					 
					
						2020-10-05 13:45:57 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							fd2d48556c 
							
						 
					 
					
						
						
							
							fix E902 and E903 numbering  
						
						
						
					 
					
						2020-10-05 13:43:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1c641e41c3 
							
						 
					 
					
						
						
							
							Remove unused import [ci skip]  
						
						
						
					 
					
						2020-10-05 11:50:11 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							03cfb2d2f4 
							
						 
					 
					
						
						
							
							Always serialize lookups and vectors to disk  
						
						
						
					 
					
						2020-10-05 09:40:20 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							b0b93854cb 
							
						 
					 
					
						
						
							
							Update ru/uk lemmatizers for new nlp.initialize  
						
						
						
					 
					
						2020-10-05 09:27:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							549758f67d 
							
						 
					 
					
						
						
							
							Adjust test for now  
						
						
						
					 
					
						2020-10-04 23:16:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4b15ff7504 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-10-04 22:47:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f1d1f78636 
							
						 
					 
					
						
						
							
							Make warning debug log [ci skip]  
						
						
						
					 
					
						2020-10-04 22:44:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3c36a57e84 
							
						 
					 
					
						
						
							
							Update data augmenters ( #6196 )  
						
						
						
						
						* Draft lower-case augmenter
* Make warning a debug log
* Update lowercase augmenter, docs and tests
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-10-04 17:46:29 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d38dc466c5 
							
						 
					 
					
						
						
							
							Adjust error [ci skip]  
						
						
						
					 
					
						2020-10-04 15:26:01 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							496228771d 
							
						 
					 
					
						
						
							
							Merge pull request  #6194  from explosion/master-tmp  
						
						
						
					 
					
						2020-10-04 15:25:41 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0307a228c8 
							
						 
					 
					
						
						
							
							Merge pull request  #6193  from explosion/fix/adjust-pipe-init  
						
						
						
						
						Adjust [initialize.components] on Language.remove_pipe and Language.rename_pipe 
						
					 
					
						2020-10-04 15:20:54 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							59deeb7da6 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into master-tmp  
						
						
						
					 
					
						2020-10-04 14:52:20 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							43d7652635 
							
						 
					 
					
						
						
							
							Merge pull request  #6192  from explosion/feature/init-attr-ruler  
						
						
						
					 
					
						2020-10-04 14:46:37 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8f018e47f8 
							
						 
					 
					
						
						
							
							Adjust [initialize.components] on Language.remove_pipe and Language.rename_pipe  
						
						
						
					 
					
						2020-10-04 14:43:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							84ae197dd6 
							
						 
					 
					
						
						
							
							Fix logger  
						
						
						
					 
					
						2020-10-04 14:16:53 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							11347f34da 
							
						 
					 
					
						
						
							
							Tidy up, tests and docs  
						
						
						
					 
					
						2020-10-04 13:54:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							96b636c2d3 
							
						 
					 
					
						
						
							
							Update attribute ruler  
						
						
						
					 
					
						2020-10-04 13:08:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							bcd52e5486 
							
						 
					 
					
						
						
							
							Tidy up errors and warnings  
						
						
						
					 
					
						2020-10-04 11:16:31 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ff914f4e6f 
							
						 
					 
					
						
						
							
							Lazy-load xx  
						
						
						
					 
					
						2020-10-04 11:10:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d3b3663942 
							
						 
					 
					
						
						
							
							Adjust error message and add test  
						
						
						
					 
					
						2020-10-04 10:11:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2110e8f86d 
							
						 
					 
					
						
						
							
							Auto-format  
						
						
						
					 
					
						2020-10-04 10:06:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cc08c88a89 
							
						 
					 
					
						
						
							
							Merge pull request  #6187  from svlandeg/fix/begin_training_pipe  
						
						
						
					 
					
						2020-10-04 10:01:02 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							3f657ed3a1 
							
						 
					 
					
						
						
							
							implement warning in __init_subclass__ instead  
						
						
						
					 
					
						2020-10-03 22:34:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3b2a78720c 
							
						 
					 
					
						
						
							
							Upd morphologizer  
						
						
						
					 
					
						2020-10-03 19:35:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							835070cedc 
							
						 
					 
					
						
						
							
							Upd test  
						
						
						
					 
					
						2020-10-03 19:35:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							70b9de8e58 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a32  
						
						
						
					 
					
						2020-10-03 19:26:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							85ede32680 
							
						 
					 
					
						
						
							
							Format  
						
						
						
					 
					
						2020-10-03 19:26:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b305f2ff5a 
							
						 
					 
					
						
						
							
							Fix loggers  
						
						
						
					 
					
						2020-10-03 19:26:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4fccd2ceaf 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-10-03 19:13:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8ea8b7d940 
							
						 
					 
					
						
						
							
							Support loading labels in morphologizer  
						
						
						
					 
					
						2020-10-03 19:13:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c2401fca41 
							
						 
					 
					
						
						
							
							Add tests for Pipe.label_data  
						
						
						
					 
					
						2020-10-03 19:12:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							80603f0fa5 
							
						 
					 
					
						
						
							
							Make SentenceRecognizer.label_data return None  
						
						
						
						
						Overwrite the method from the base class (Tagger) but don't export anything in "init labels" 
						
					 
					
						2020-10-03 18:54:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d6c967401f 
							
						 
					 
					
						
						
							
							Increment version  
						
						
						
					 
					
						2020-10-03 17:20:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3bc3c05fcc 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-10-03 17:20:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7c4ab7e82c 
							
						 
					 
					
						
						
							
							Fix Lemmatizer.get_lookups_config  
						
						
						
					 
					
						2020-10-03 17:16:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dd542ec6a4 
							
						 
					 
					
						
						
							
							Fix label initialization of textcat component ( #6190 )  
						
						
						
					 
					
						2020-10-03 17:07:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							989a96308f 
							
						 
					 
					
						
						
							
							Tidy up, auto-format, types  
						
						
						
					 
					
						2020-10-03 16:31:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7b127f307e 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a30  
						
						
						
					 
					
						2020-10-03 16:06:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							db419f6b2f 
							
						 
					 
					
						
						
							
							Improve control of training progress and logging ( #6184 )  
						
						
						
						
						* Make logging and progress easier to control
* Update docs
* Cleanup errors
* Fix ConfigValidationError
* Pass stdout/stderr, not wasabi.Printer
* Fix type
* Upd logging example
* Fix logger example
* Fix type 
						
					 
					
						2020-10-03 14:57:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ae15c9de79 
							
						 
					 
					
						
						
							
							Raise error from caught KeyError to preserve traceback  
						
						
						
					 
					
						2020-10-03 11:43:56 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f758804401 
							
						 
					 
					
						
						
							
							Save one line of code  
						
						
						
					 
					
						2020-10-03 11:41:28 +02:00 
						 
				 
			
				
					
						
							
							
								Stanislav Schmidt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3589a64d44 
							
						 
					 
					
						
						
							
							Change type of texts argument in pipe to iterable ( #6186 )  
						
						
						
						
						* Change type of texts argument in pipe to iterable
* Add contributor agreement 
						
					 
					
						2020-10-02 21:00:11 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							02247cccaf 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/small-fixes  
						
						
						
					 
					
						2020-10-02 20:48:11 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							fb48de349c 
							
						 
					 
					
						
						
							
							bwd compat for pipe.begin_training  
						
						
						
					 
					
						2020-10-02 20:31:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6965cdf16d 
							
						 
					 
					
						
						
							
							Fix comment  
						
						
						
					 
					
						2020-10-02 17:26:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3cf10a0729 
							
						 
					 
					
						
						
							
							Merge pull request  #6183  from adrianeboyd/feature/quickstart-morphologizer  
						
						
						
						
						Add morphologizer to quickstart template 
						
					 
					
						2020-10-02 17:08:01 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							62ccd5c4df 
							
						 
					 
					
						
						
							
							Relax model meta performance schema ( #6185 )  
						
						
						
						
						Allow more embedded per_x in `ModelMetaSchema` 
						
					 
					
						2020-10-02 16:37:21 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							09dcb75076 
							
						 
					 
					
						
						
							
							small UX fix for DocBin ( #6167 )  
						
						
						
						
						* add informative warning when messing up store_user_data DocBin flags
* cleanup test
* rename to patterns_path 
						
					 
					
						2020-10-02 15:43:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f0b30aedad 
							
						 
					 
					
						
						
							
							Make lemmatizers use initialize logic ( #6182 )  
						
						
						
						
						* Make lemmatizer use initialize logic and tidy up
* Fix typo
* Raise for uninitialized tables 
						
					 
					
						2020-10-02 15:42:36 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							22158dc24a 
							
						 
					 
					
						
						
							
							Add morphologizer to quickstart template  
						
						
						
					 
					
						2020-10-02 15:06:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d2aa662ab2 
							
						 
					 
					
						
						
							
							Merge pull request  #6179  from adrianeboyd/feature/token-morph-refactor-2 [ci skip]  
						
						
						
					 
					
						2020-10-02 12:10:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c41a4332e4 
							
						 
					 
					
						
						
							
							Add test for custom data augmentation  
						
						
						
					 
					
						2020-10-02 11:37:56 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							acc391c2a8 
							
						 
					 
					
						
						
							
							remove redundant str() call  
						
						
						
					 
					
						2020-10-02 11:05:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3856048437 
							
						 
					 
					
						
						
							
							Merge pull request  #6178  from explosion/feature/file-readers  
						
						
						
						
						Integrate file readers via srsly, update orth_variants loading 
						
					 
					
						2020-10-02 10:26:09 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							f83dfe62da 
							
						 
					 
					
						
						
							
							Fix test  
						
						
						
					 
					
						2020-10-02 10:17:26 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							65dfaa4f4b 
							
						 
					 
					
						
						
							
							Also accept MorphAnalysis in set_morph  
						
						
						
					 
					
						2020-10-02 08:33:43 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							77e08c398f 
							
						 
					 
					
						
						
							
							Switch reset value for set_morph to None  
						
						
						
					 
					
						2020-10-02 08:25:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							568768643e 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-10-02 01:50:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							01c1538c72 
							
						 
					 
					
						
						
							
							Integrate file readers  
						
						
						
					 
					
						2020-10-02 01:36:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							af282ae732 
							
						 
					 
					
						
						
							
							Fix import  
						
						
						
					 
					
						2020-10-02 01:12:34 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e59ecb12c0 
							
						 
					 
					
						
						
							
							Auto-format  
						
						
						
					 
					
						2020-10-02 01:12:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							75a1569908 
							
						 
					 
					
						
						
							
							Merge  
						
						
						
					 
					
						2020-10-01 23:07:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							300e5a9928 
							
						 
					 
					
						
						
							
							Avoid relying on NORM in default v3 models ( #6176 )  
						
						
						
						
						* Allow CharacterEmbed to specify feature
* Default to LOWER in character embed
* Update tok2vec
* Use LOWER, not NORM 
						
					 
					
						2020-10-01 23:05:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5762876dcc 
							
						 
					 
					
						
						
							
							Update default config [ci skip]  
						
						
						
					 
					
						2020-10-01 22:27:37 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							86c3ec9c2b 
							
						 
					 
					
						
						
							
							Refactor Token morph setting ( #6175 )  
						
						
						
						
						* Refactor Token morph setting
* Remove `Token.morph_`
* Add `Token.set_morph()`
  * `0` resets `token.c.morph` to unset
  * Any other values are passed to `Morphology.add`
* Add token.morph setter to set from MorphAnalysis 
						
					 
					
						2020-10-01 22:21:46 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b854bca15c 
							
						 
					 
					
						
						
							
							Default to LOWER in character embed  
						
						
						
					 
					
						2020-10-01 22:17:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							684a77870b 
							
						 
					 
					
						
						
							
							Allow CharacterEmbed to specify feature  
						
						
						
					 
					
						2020-10-01 22:17:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							da30701cd1 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-10-01 21:58:11 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d48ddd6c9a 
							
						 
					 
					
						
						
							
							Remove default initialize lookups  
						
						
						
					 
					
						2020-10-01 21:54:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1700c8541e 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-10-01 17:57:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f2627157c8 
							
						 
					 
					
						
						
							
							Update docs [ci skip]  
						
						
						
					 
					
						2020-10-01 17:38:17 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7f68f4bd92 
							
						 
					 
					
						
						
							
							Hide jsonl_loc on init vectors and tidy up [ci skip]  
						
						
						
					 
					
						2020-10-01 16:44:17 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							27cbffff1b 
							
						 
					 
					
						
						
							
							Minor edit to CoNLL-U converter ( #6172 )  
						
						
						
						
						This doesn't make a difference given how the `merged_morph` values
override the `morph` values for all the final docs, but could have led
to unexpected bugs in the future if the converter is modified. 
						
					 
					
						2020-10-01 16:23:42 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a22215f427 
							
						 
					 
					
						
						
							
							Add FeatureExtractor from Thinc ( #6170 )  
						
						
						
						
						* move featureextractor from Thinc
* Update website/docs/api/architectures.md
Co-authored-by: Ines Montani <ines@ines.io>
* Update website/docs/api/architectures.md
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2020-10-01 16:22:48 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							73538782a0 
							
						 
					 
					
						
						
							
							Switch Doc.__init__(ents=) to IOB tags ( #6173 )  
						
						
						
						
						* Switch Doc.__init__(ents=) to IOB tags
* Fix check for "-"
* Allow "" or None as missing IOB tag 
						
					 
					
						2020-10-01 16:22:18 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							df98d3ef9f 
							
						 
					 
					
						
						
							
							Update import from collections.abc ( #6174 )  
						
						
						
					 
					
						2020-10-01 16:21:49 +02:00 
						 
				 
			
				
					
						
							
							
								Yohei Tamura 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3243ddac8f 
							
						 
					 
					
						
						
							
							Fix/span.sent ( #6083 )  
						
						
						
						
						* add fail test
* fix test
* fix span.sent
* Remove incorrect implicit check
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-10-01 14:01:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0a8a124a6e 
							
						 
					 
					
						
						
							
							Update docs [ci skip]  
						
						
						
					 
					
						2020-10-01 12:15:53 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							44160cd52f 
							
						 
					 
					
						
						
							
							Tidy up [ci skip]  
						
						
						
					 
					
						2020-10-01 10:41:19 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							381258b75b 
							
						 
					 
					
						
						
							
							Merge pull request  #6165  from explosion/feature/update-tokenizers-initialize  
						
						
						
					 
					
						2020-10-01 09:49:47 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6787e56315 
							
						 
					 
					
						
						
							
							print debugging warning before raising error if model not properly initialized  
						
						
						
					 
					
						2020-10-01 09:21:00 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							5121972930 
							
						 
					 
					
						
						
							
							add types of Tok2Vec embedding layers  
						
						
						
					 
					
						2020-10-01 09:20:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4b6afd3611 
							
						 
					 
					
						
						
							
							Remove English [initialize] default block for now to get tests to pass  
						
						
						
					 
					
						2020-09-30 23:49:29 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6f29f68f69 
							
						 
					 
					
						
						
							
							Update errors and make Tokenizer.initialize args less strict  
						
						
						
					 
					
						2020-09-30 23:48:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a103ab5f1a 
							
						 
					 
					
						
						
							
							Update augmenter lookups and docs  
						
						
						
					 
					
						2020-09-30 23:03:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5128298964 
							
						 
					 
					
						
						
							
							Add missing augmenter  
						
						
						
					 
					
						2020-09-30 20:18:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59294e91aa 
							
						 
					 
					
						
						
							
							Restore the 'jsonl' arg for init vectors  
						
						
						
						
						The lexemes.jsonl file is still used in our English vectors, and it may
be required by users as well. I think it's worth supporting the option. 
						
					 
					
						2020-09-30 19:06:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c379a4274a 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-09-30 16:52:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e58dca3028 
							
						 
					 
					
						
						
							
							Add read_labels  
						
						
						
					 
					
						2020-09-30 16:52:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							23c63eefaf 
							
						 
					 
					
						
						
							
							Tidy up env vars [ci skip]  
						
						
						
					 
					
						2020-09-30 15:15:11 +02:00 
						 
				 
			
				
					
						
							
							
								Elijah Rippeth 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4cbb954281 
							
						 
					 
					
						
						
							
							reorder so tagmap is replaced only if a custom file is provided. ( #6164 )  
						
						
						
						
						* reorder so tagmap is replaced only if a custom file is provided.
* Remove unneeded variable initialization
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-09-30 13:26:06 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							6b7bb32834 
							
						 
					 
					
						
						
							
							Refactor Chinese initialization  
						
						
						
					 
					
						2020-09-30 11:46:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							34f9c26c62 
							
						 
					 
					
						
						
							
							Add lexeme norm defaults  
						
						
						
					 
					
						2020-09-30 10:20:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a5debb356d 
							
						 
					 
					
						
						
							
							Tidy up and adjust logging [ci skip]  
						
						
						
					 
					
						2020-09-30 01:22:08 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							56a2f778c4 
							
						 
					 
					
						
						
							
							Add logging [ci skip]  
						
						
						
					 
					
						2020-09-30 01:08:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fe3f111c37 
							
						 
					 
					
						
						
							
							Merge pull request  #6168  from explosion/fix/default-corpus-values  
						
						
						
					 
					
						2020-09-30 00:24:02 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b799af16de 
							
						 
					 
					
						
						
							
							Don't raise in Pipe.initialize if not implemented  
						
						
						
					 
					
						2020-09-30 00:05:27 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bc61691f6f 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-09-29 23:41:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f52249fe2e 
							
						 
					 
					
						
						
							
							Fix data augmentation  
						
						
						
					 
					
						2020-09-29 23:40:54 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							14c4da547f 
							
						 
					 
					
						
						
							
							Try to fix augmentation  
						
						
						
					 
					
						2020-09-29 23:08:56 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ae51843468 
							
						 
					 
					
						
						
							
							Remove augmenter from jinja template [ci skip]  
						
						
						
					 
					
						2020-09-29 23:08:50 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9bb958fd0a 
							
						 
					 
					
						
						
							
							Fix debug data [ci skip]  
						
						
						
					 
					
						2020-09-29 23:07:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a2aa1f6882 
							
						 
					 
					
						
						
							
							Disable the OVL augmentation by default  
						
						
						
					 
					
						2020-09-29 23:02:40 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							df8dd91b6f 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into fix/default-corpus-values  
						
						
						
					 
					
						2020-09-29 22:55:39 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0a1ee109db 
							
						 
					 
					
						
						
							
							Remove init form path  
						
						
						
					 
					
						2020-09-29 22:53:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ad6d40d028 
							
						 
					 
					
						
						
							
							Add logging  
						
						
						
					 
					
						2020-09-29 22:53:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c334a7d45f 
							
						 
					 
					
						
						
							
							Remove  
						
						
						
					 
					
						2020-09-29 22:38:39 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1aeef3bfbb 
							
						 
					 
					
						
						
							
							Make corpus paths default to None and improve errors  
						
						
						
					 
					
						2020-09-29 22:33:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0250bcf6a3 
							
						 
					 
					
						
						
							
							Show validation error during init  
						
						
						
					 
					
						2020-09-29 22:29:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							da30bae8a6 
							
						 
					 
					
						
						
							
							Use __pyx_vtable__ instead of __reduce_cython__  
						
						
						
					 
					
						2020-09-29 22:04:17 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							43c92ec8c9 
							
						 
					 
					
						
						
							
							Resolve dir for better output [ci skip]  
						
						
						
					 
					
						2020-09-29 22:01:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fa47f87924 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-09-29 21:39:28 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							604be54a5c 
							
						 
					 
					
						
						
							
							Support --code in evaluate CLI [ci skip]  
						
						
						
					 
					
						2020-09-29 21:20:56 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6467a560e3 
							
						 
					 
					
						
						
							
							WIP: Test updating Chinese tokenizer  
						
						
						
					 
					
						2020-09-29 21:10:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4f3102d09c 
							
						 
					 
					
						
						
							
							Auto-format  
						
						
						
					 
					
						2020-09-29 21:09:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							798040bc1d 
							
						 
					 
					
						
						
							
							Fix language detection  
						
						
						
					 
					
						2020-09-29 21:08:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							78021089f9 
							
						 
					 
					
						
						
							
							Merge pull request  #6160  from explosion/feature/prepare  
						
						
						
					 
					
						2020-09-29 20:55:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c3f8c09d7d 
							
						 
					 
					
						
						
							
							Merge pull request  #6154  from adrianeboyd/bugfix/chinese-tokenizer-pickle  
						
						
						
					 
					
						2020-09-29 20:54:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d3c63b7965 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/prepare  
						
						
						
					 
					
						2020-09-29 20:53:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2be80379ec 
							
						 
					 
					
						
						
							
							Fix small issues, resolve_dot_names and debug model  
						
						
						
					 
					
						2020-09-29 20:38:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a4da3120b4 
							
						 
					 
					
						
						
							
							Fix multitasks  
						
						
						
					 
					
						2020-09-29 18:33:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0b5c72fce2 
							
						 
					 
					
						
						
							
							Fix incorrect docstrings  
						
						
						
					 
					
						2020-09-29 18:30:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7851020653 
							
						 
					 
					
						
						
							
							Update tests  
						
						
						
					 
					
						2020-09-29 18:14:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							71a0ee274a 
							
						 
					 
					
						
						
							
							Move init labels to init pipeline module  
						
						
						
					 
					
						2020-09-29 18:09:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dba26186ef 
							
						 
					 
					
						
						
							
							Handle None default args in Cython methods  
						
						
						
					 
					
						2020-09-29 18:08:02 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9353a82076 
							
						 
					 
					
						
						
							
							Auto-format  
						
						
						
					 
					
						2020-09-29 18:07:48 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							534e1ef498 
							
						 
					 
					
						
						
							
							Fix template  
						
						
						
					 
					
						2020-09-29 17:02:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f2352eb701 
							
						 
					 
					
						
						
							
							Test with default value  
						
						
						
					 
					
						2020-09-29 17:00:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8ce9f44433 
							
						 
					 
					
						
						
							
							Merge branch 'feature/prepare' of  https://github.com/explosion/spaCy  into feature/prepare  
						
						
						
					 
					
						2020-09-29 16:57:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e4f535a964 
							
						 
					 
					
						
						
							
							Fix Pipe.labels  
						
						
						
					 
					
						2020-09-29 16:55:07 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4ad26f4a2f 
							
						 
					 
					
						
						
							
							Move reader  
						
						
						
					 
					
						2020-09-29 16:54:53 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							30c76dbd67 
							
						 
					 
					
						
						
							
							Merge branch 'feature/prepare' of  https://github.com/explosion/spaCy  into feature/prepare  
						
						
						
					 
					
						2020-09-29 16:53:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							43fc7a316d 
							
						 
					 
					
						
						
							
							Add registry function for reading jsonl  
						
						
						
					 
					
						2020-09-29 16:49:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1fd002180e 
							
						 
					 
					
						
						
							
							Allow more components to use labels  
						
						
						
					 
					
						2020-09-29 16:48:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							99bff78617 
							
						 
					 
					
						
						
							
							Use labels in tagger  
						
						
						
					 
					
						2020-09-29 16:48:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ca72608059 
							
						 
					 
					
						
						
							
							Fix language  
						
						
						
					 
					
						2020-09-29 16:48:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							10847c7f4e 
							
						 
					 
					
						
						
							
							Fix arg  
						
						
						
					 
					
						2020-09-29 16:48:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fd594cfb9b 
							
						 
					 
					
						
						
							
							Tighten up format  
						
						
						
					 
					
						2020-09-29 16:47:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e70a00fa76 
							
						 
					 
					
						
						
							
							Remove unnecessary warning from train  
						
						
						
					 
					
						2020-09-29 16:47:54 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3f0d61232d 
							
						 
					 
					
						
						
							
							Remove outdated arg from train  
						
						
						
					 
					
						2020-09-29 16:47:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e957d66b92 
							
						 
					 
					
						
						
							
							Merge branch 'feature/prepare' of  https://github.com/explosion/spaCy  into feature/prepare  
						
						
						
					 
					
						2020-09-29 16:22:53 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							978ab54a84 
							
						 
					 
					
						
						
							
							Fix logging  
						
						
						
					 
					
						2020-09-29 16:22:41 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							45daf5c9fe 
							
						 
					 
					
						
						
							
							Add init labels command  
						
						
						
					 
					
						2020-09-29 16:22:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							58c8d4b414 
							
						 
					 
					
						
						
							
							Add label_data property to pipeline  
						
						
						
					 
					
						2020-09-29 16:22:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							aa2a6882d0 
							
						 
					 
					
						
						
							
							Fix logging  
						
						
						
					 
					
						2020-09-29 16:08:39 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							63d1598137 
							
						 
					 
					
						
						
							
							Simplify config use in Language.initialize  
						
						
						
					 
					
						2020-09-29 16:05:48 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							56f8bc73ef 
							
						 
					 
					
						
						
							
							Add more tests  
						
						
						
					 
					
						2020-09-29 15:23:34 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6a04e5adea 
							
						 
					 
					
						
						
							
							encoding UTF8 ( #6161 )  
						
						
						
					 
					
						2020-09-29 14:49:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							591038b1a4 
							
						 
					 
					
						
						
							
							Add test  
						
						
						
					 
					
						2020-09-29 12:54:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							adca08a12f 
							
						 
					 
					
						
						
							
							Pass nlp forward  
						
						
						
					 
					
						2020-09-29 12:21:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f171903139 
							
						 
					 
					
						
						
							
							Clean up sgd and pipeline -> nlp  
						
						
						
					 
					
						2020-09-29 12:20:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							612bbf85ab 
							
						 
					 
					
						
						
							
							Update initialize.py  
						
						
						
					 
					
						2020-09-29 12:14:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							42f0e4c946 
							
						 
					 
					
						
						
							
							Clean up  
						
						
						
					 
					
						2020-09-29 12:14:08 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9c8b2524fe 
							
						 
					 
					
						
						
							
							Upd initialize args  
						
						
						
					 
					
						2020-09-29 12:08:37 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e1fdf2b7c5 
							
						 
					 
					
						
						
							
							Upd tests  
						
						
						
					 
					
						2020-09-29 12:05:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							50410c17ac 
							
						 
					 
					
						
						
							
							Update schemas.py  
						
						
						
					 
					
						2020-09-29 12:05:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f2d1b7feb5 
							
						 
					 
					
						
						
							
							Clean up sgd  
						
						
						
					 
					
						2020-09-29 12:00:08 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							78396d137f 
							
						 
					 
					
						
						
							
							Integrate initialize settings  
						
						
						
					 
					
						2020-09-29 11:57:08 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dec984a9c1 
							
						 
					 
					
						
						
							
							Update Language.initialize and support components/tokenizer settings  
						
						
						
					 
					
						2020-09-29 11:52:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b3b6868639 
							
						 
					 
					
						
						
							
							Remove 'sgd' arg from component initialize  
						
						
						
					 
					
						2020-09-29 11:42:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5276db6f3f 
							
						 
					 
					
						
						
							
							Remove 'device' argument from Language, clean up 'sgd' arg  
						
						
						
					 
					
						2020-09-29 11:42:19 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4925ad760a 
							
						 
					 
					
						
						
							
							Add init vectors  
						
						
						
					 
					
						2020-09-29 10:58:50 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							64d90039a1 
							
						 
					 
					
						
						
							
							encoding UTF8  
						
						
						
					 
					
						2020-09-29 10:54:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ff9a63bfbd 
							
						 
					 
					
						
						
							
							begin_training -> initialize  
						
						
						
					 
					
						2020-09-28 21:35:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							046f655d86 
							
						 
					 
					
						
						
							
							Fix error  
						
						
						
					 
					
						2020-09-28 21:17:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a139fe672b 
							
						 
					 
					
						
						
							
							Fix typos and refactor CLI logging  
						
						
						
					 
					
						2020-09-28 21:17:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2e9c9e74af 
							
						 
					 
					
						
						
							
							Fix config resolution and interpolation  
						
						
						
						
						TODO: auto-interpolate in Thinc if config is dict (i.e. likely subsection) 
						
					 
					
						2020-09-28 15:34:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							02838a1d47 
							
						 
					 
					
						
						
							
							Fix resolve_dot_names  
						
						
						
					 
					
						2020-09-28 15:27:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							822ea4ef61 
							
						 
					 
					
						
						
							
							Refactor CLI  
						
						
						
					 
					
						2020-09-28 15:09:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a89e0ff7cb 
							
						 
					 
					
						
						
							
							Fix typo  
						
						
						
					 
					
						2020-09-28 12:55:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a62337b3f3 
							
						 
					 
					
						
						
							
							Tidy up vocab init  
						
						
						
					 
					
						2020-09-28 12:53:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c22ecc66bb 
							
						 
					 
					
						
						
							
							Don't support init path for now  
						
						
						
					 
					
						2020-09-28 12:46:28 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f49288ab81 
							
						 
					 
					
						
						
							
							Update default_config_pretraining.cfg  
						
						
						
					 
					
						2020-09-28 12:31:54 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a5f2cc0509 
							
						 
					 
					
						
						
							
							Tidy up and remove raw text (rehearsal) for now  
						
						
						
					 
					
						2020-09-28 12:30:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1590de11b1 
							
						 
					 
					
						
						
							
							Update config  
						
						
						
					 
					
						2020-09-28 12:05:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9f6ad06452 
							
						 
					 
					
						
						
							
							Upd default config  
						
						
						
					 
					
						2020-09-28 12:00:23 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e44a7519cd 
							
						 
					 
					
						
						
							
							Update CLI and add [initialize] block  
						
						
						
					 
					
						2020-09-28 11:56:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d5155376fd 
							
						 
					 
					
						
						
							
							Update vocab init  
						
						
						
					 
					
						2020-09-28 11:30:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8b74fd19df 
							
						 
					 
					
						
						
							
							init pipeline -> init nlp  
						
						
						
					 
					
						2020-09-28 11:13:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2fdb7285a0 
							
						 
					 
					
						
						
							
							Update CLI  
						
						
						
					 
					
						2020-09-28 11:06:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							553bfea641 
							
						 
					 
					
						
						
							
							Fix commands  
						
						
						
					 
					
						2020-09-28 10:53:17 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							44bad1474c 
							
						 
					 
					
						
						
							
							Add init_pipeline file  
						
						
						
					 
					
						2020-09-28 09:47:34 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							65448b2e34 
							
						 
					 
					
						
						
							
							Remove schema=None until Optional  
						
						
						
					 
					
						2020-09-28 03:42:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b886f53c31 
							
						 
					 
					
						
						
							
							init-pipeline runs (maybe doesnt work)  
						
						
						
					 
					
						2020-09-28 03:42:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ed2aff2db3 
							
						 
					 
					
						
						
							
							Remove unused train code  
						
						
						
					 
					
						2020-09-28 03:12:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3a0a3b8db6 
							
						 
					 
					
						
						
							
							Dont hard-code for 'corpora' name  
						
						
						
					 
					
						2020-09-28 03:06:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a023cf3ecc 
							
						 
					 
					
						
						
							
							Add (untested) resolve_dot_names util  
						
						
						
					 
					
						2020-09-28 03:06:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a976da168c 
							
						 
					 
					
						
						
							
							Support data augmentation in Corpus (#6155)
						
						
						
						
						* Support data augmentation in Corpus
* Note initial docs for data augmentation
* Add augmenter to quickstart
* Fix flake8
* Format
* Fix test
* Update spacy/tests/training/test_training.py
* Improve data augmentation arguments
* Update templates
* Move randomization out into caller
* Refactor
* Update spacy/training/augment.py
* Update spacy/tests/training/test_training.py
* Fix augment
* Fix test 
						
					 
					
						2020-09-28 03:03:27 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							13b1605ee6 
							
						 
					 
					
						
						
							
							Add init script  
						
						
						
					 
					
						2020-09-28 01:08:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a3e1791c9c 
							
						 
					 
					
						
						
							
							Upd train  
						
						
						
					 
					
						2020-09-28 01:08:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b5556093e2 
							
						 
					 
					
						
						
							
							Start updating train script  
						
						
						
					 
					
						2020-09-27 23:59:44 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9016d23cc5 
							
						 
					 
					
						
						
							
							Fix exclude and add test  
						
						
						
					 
					
						2020-09-27 23:34:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							658fad428a 
							
						 
					 
					
						
						
							
							Fix base schema integration  
						
						
						
					 
					
						2020-09-27 22:50:36 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e04bd16f7f 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/new-thinc-config-resolution  
						
						
						
					 
					
						2020-09-27 22:34:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d7ad65a9bb 
							
						 
					 
					
						
						
							
							Fix handling of error description [ci skip]  
						
						
						
					 
					
						2020-09-27 22:31:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7e938ed63e 
							
						 
					 
					
						
						
							
							Update config resolution to use new Thinc  
						
						
						
					 
					
						2020-09-27 22:21:31 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							013b66de05 
							
						 
					 
					
						
						
							
							Add tokenizer scoring to ja / ko / zh (#6152)
						
						
						
					 
					
						2020-09-27 22:20:45 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a6548ead17 
							
						 
					 
					
						
						
							
							Add _ as a symbol (#6153)
						
						
						
						
						* Add _ to StringStore in Morphology
* Add _ as a symbol
Add `_` as a symbol instead of adding to the `StringStore`. 
						
					 
					
						2020-09-27 22:20:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							39b178999c 
							
						 
					 
					
						
						
							
							Tmp notes  
						
						
						
					 
					
						2020-09-27 20:13:38 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							8393dbedad 
							
						 
					 
					
						
						
							
							Minor fixes  
						
						
						
						
						* Put `cfg` back in serialization
* Add `pickle5` to pytest conf 
						
					 
					
						2020-09-27 15:15:53 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							54fe871935 
							
						 
					 
					
						
						
							
							Fix formatting, refactor pickle5 exceptions  
						
						
						
					 
					
						2020-09-27 14:37:28 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							11e195d3ed 
							
						 
					 
					
						
						
							
							Update ChineseTokenizer  
						
						
						
						
						* Allow `pkuseg_model` to be set to `None` on initialization
* Don't save config within tokenizer
* Force convert pkuseg_model to use pickle protocol 4 by reencoding with
`pickle5` on serialization
* Update pkuseg serialization test 
						
					 
					
						2020-09-27 14:00:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b4486d747d 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into fix/train-config-interpolation  
						
						
						
					 
					
						2020-09-26 15:32:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8fea06d55e 
							
						 
					 
					
						
						
							
							Merge pull request #6149 from adrianeboyd/feature/attributeruler-match-ids
						
						
						
						
						Simplify string match IDs for AttributeRuler 
						
					 
					
						2020-09-26 15:31:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b2d07de786 
							
						 
					 
					
						
						
							
							Construct nlp from uninterpolated config before training  
						
						
						
					 
					
						2020-09-26 15:16:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ca3c997062 
							
						 
					 
					
						
						
							
							Improve CLI config validation with latest Thinc  
						
						
						
					 
					
						2020-09-26 13:13:57 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							6c25e60089 
							
						 
					 
					
						
						
							
							Simplify string match IDs for AttributeRuler  
						
						
						
					 
					
						2020-09-26 11:12:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							702edf52a0 
							
						 
					 
					
						
						
							
							Fix attributeruler  
						
						
						
					 
					
						2020-09-26 00:30:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							821f37254c 
							
						 
					 
					
						
						
							
							Fix attributeruler  
						
						
						
					 
					
						2020-09-26 00:19:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							98327f66a9 
							
						 
					 
					
						
						
							
							Fix attributeruler key  
						
						
						
					 
					
						2020-09-25 23:20:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							092ce4648e 
							
						 
					 
					
						
						
							
							Make DocBin output stable data (set iteration)  
						
						
						
					 
					
						2020-09-25 22:20:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							26afd3bd90 
							
						 
					 
					
						
						
							
							Fix iteration order  
						
						
						
					 
					
						2020-09-25 21:47:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3d8388969e 
							
						 
					 
					
						
						
							
							Sort paths for cache consistency  
						
						
						
					 
					
						2020-09-25 19:07:26 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c3b5a3cfff 
							
						 
					 
					
						
						
							
							Clean up MorphAnalysisC struct (#6146)
						
						
						
					 
					
						2020-09-25 15:56:48 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							009ba14aaf 
							
						 
					 
					
						
						
							
							Fix pretraining in train script (#6143)
						
						
						
						
						* update pretraining API in train CLI
* bump thinc to 8.0.0a35
* bump to 3.0.0a26
* doc fixes
* small doc fix 
						
					 
					
						2020-09-25 15:47:10 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							50f20cf722 
							
						 
					 
					
						
						
							
							Revert changes to Scorer.score_spans  
						
						
						
					 
					
						2020-09-25 08:21:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							93d7ff309f 
							
						 
					 
					
						
						
							
							Remove print  
						
						
						
					 
					
						2020-09-24 21:05:27 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							16475528f7 
							
						 
					 
					
						
						
							
							Fix skipped documents in entity scorer (#6137)
						
						
						
						
						* Fix skipped documents in entity scorer
* Add back the skipping of unannotated entities
* Update spacy/scorer.py
* Use more specific NER scorer
* Fix import
* Fix get_ner_prf
* Add scorer
* Fix scorer
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2020-09-24 20:38:57 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2abb4ba9db 
							
						 
					 
					
						
						
							
							Make a pre-check to speed up alignment cache (#6139)
						
						
						
						
						* Dirty trick to fast-track alignment cache
* Improve alignment cache check
* Fix header
* Fix align cache
* Fix align logic 
						
					 
					
						2020-09-24 18:13:39 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							26e28ed413 
							
						 
					 
					
						
						
							
							Fix combined scores if multiple components report it  
						
						
						
					 
					
						2020-09-24 17:11:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0b52b6904c 
							
						 
					 
					
						
						
							
							Update entity_linker.py  
						
						
						
					 
					
						2020-09-24 17:10:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							20b89a9717 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-09-24 16:57:02 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3c062b3911 
							
						 
					 
					
						
						
							
							Add MORPH handling to Matcher (#6107)
						
						
						
						
						* Add MORPH handling to Matcher
* Add `MORPH` to `Matcher` schema
* Rename `_SetMemberPredicate` to `_SetPredicate`
* Add `ISSUBSET` and `ISSUPERSET` operators to `_SetPredicate`
  * Add special handling for normalization and conversion of morph
    values into sets
  * For other attrs, `ISSUBSET` acts like `IN` and `ISSUPERSET` only
    matches for 0 or 1 values
* Update test
* Rename to IS_SUBSET and IS_SUPERSET 
						
					 
					
						2020-09-24 16:55:09 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							59340606b7 
							
						 
					 
					
						
						
							
							Add option to disable Matcher errors (#6125)
						
						
						
						
						* Add option to disable Matcher errors
* Add option to disable Matcher errors when a doc doesn't contain a
particular type of annotation
Minor additional change:
* Update `AttributeRuler.load_from_morph_rules` to allow direct `MORPH`
values
* Rename suppress_errors to allow_missing
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
* Refactor annotation checks in Matcher and PhraseMatcher
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-09-24 16:54:39 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c7eedd3534 
							
						 
					 
					
						
						
							
							updates to NEL functionality (#6132)
						
						
						
						
						* NEL: read sentences and ents from reference
* fiddling with sent_start annotations
* add KB serialization test
* KB write additional file with strings.json
* score_links function to calculate NEL P/R/F
* formatting
* documentation 
						
					 
					
						2020-09-24 16:53:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d0ef4a4cf5 
							
						 
					 
					
						
						
							
							Prevent division by zero in score weights  
						
						
						
					 
					
						2020-09-24 16:42:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							74ee456374 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-09-24 16:11:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0bc214c102 
							
						 
					 
					
						
						
							
							Fix pull  
						
						
						
					 
					
						2020-09-24 16:11:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3f751e68f5 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-09-24 14:45:41 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							58dde293ce 
							
						 
					 
					
						
						
							
							Merge pull request #6089 from adrianeboyd/feature/doc-ents-v3-2
						
						
						
					 
					
						2020-09-24 14:44:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							74e1f192b4 
							
						 
					 
					
						
						
							
							Merge pull request #6134 from explosion/feature/training_before_to_disk
						
						
						
					 
					
						2020-09-24 14:44:11 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							24e7ac3f2b 
							
						 
					 
					
						
						
							
							Fix download CLI [ci skip]  
						
						
						
					 
					
						2020-09-24 14:43:56 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							88e54caa12 
							
						 
					 
					
						
						
							
							accuracy -> performance  
						
						
						
					 
					
						2020-09-24 14:32:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							92f8b6959a 
							
						 
					 
					
						
						
							
							Fix typo  
						
						
						
					 
					
						2020-09-24 13:48:41 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							5c13e0cf1b 
							
						 
					 
					
						
						
							
							Remove unused error  
						
						
						
					 
					
						2020-09-24 13:41:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							be56c0994b 
							
						 
					 
					
						
						
							
							Add [training.before_to_disk] callback  
						
						
						
					 
					
						2020-09-24 12:40:25 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							8eaacaae97 
							
						 
					 
					
						
						
							
							Refactor Doc.ents setter to use Doc.set_ents  
						
						
						
						
						Additional changes:
* Entity spans with missing labels are ignored
* Fix ent_kb_id setting in `Doc.set_ents` 
						
					 
					
						2020-09-24 12:36:51 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c6c67b606e 
							
						 
					 
					
						
						
							
							Merge pull request #6133 from explosion/fix/score_weights
						
						
						
					 
					
						2020-09-24 12:00:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f69fea8b25 
							
						 
					 
					
						
						
							
							Improve error handling around non-number scores  
						
						
						
					 
					
						2020-09-24 11:29:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4eb39b5c43 
							
						 
					 
					
						
						
							
							Fix logging  
						
						
						
					 
					
						2020-09-24 11:04:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4bbe41f017 
							
						 
					 
					
						
						
							
							Fix combined scores and update test  
						
						
						
					 
					
						2020-09-24 10:42:47 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c645c4e7ce 
							
						 
					 
					
						
						
							
							fix micro PRF for textcat (#6130)
						
						
						
						
						* fix micro PRF for textcat
* small fix 
						
					 
					
						2020-09-24 10:31:17 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							17a6b0a173 
							
						 
					 
					
						
						
							
							Make project pull order insensitive (#6131)
						
						
						
					 
					
						2020-09-24 10:30:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ae51f580c1 
							
						 
					 
					
						
						
							
							Fix handling of score_weights  
						
						
						
					 
					
						2020-09-24 10:27:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f25f05c503 
							
						 
					 
					
						
						
							
							Adjust sort order [ci skip]  
						
						
						
					 
					
						2020-09-23 20:03:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3f77eb749c 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-09-23 19:50:15 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							b816ace4bb 
							
						 
					 
					
						
						
							
							format  
						
						
						
					 
					
						2020-09-23 17:33:13 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							5a9fdbc8ad 
							
						 
					 
					
						
						
							
							state_type as Literal  
						
						
						
					 
					
						2020-09-23 17:32:14 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							35dbc63578 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/nr_features  
						
						
						
						
						# Conflicts:
#	spacy/ml/models/parser.py
#	spacy/tests/serialize/test_serialize_config.py
#	website/docs/api/architectures.md 
						
					 
					
						2020-09-23 17:01:13 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							25b34bba94 
							
						 
					 
					
						
						
							
							throw custom error when state_type is invalid  
						
						
						
					 
					
						2020-09-23 16:57:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							916050bf2f 
							
						 
					 
					
						
						
							
							Merge pull request #6127 from explosion/feature/literal-nr_feature_tokens
						
						
						
					 
					
						2020-09-23 16:56:08 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3c3863654e 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-09-23 16:54:43 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							dd2292793f 
							
						 
					 
					
						
						
							
							'parser' instead of 'deps' for state_type  
						
						
						
					 
					
						2020-09-23 16:53:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							50a4425cda 
							
						 
					 
					
						
						
							
							Adjust docs  
						
						
						
					 
					
						2020-09-23 16:03:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							76bbed3466 
							
						 
					 
					
						
						
							
							Use Literal type for nr_feature_tokens  
						
						
						
					 
					
						2020-09-23 16:00:03 +02:00 
						 
				 
			
				
					
						
							
							
								Muhammad Fahmi Rasyid 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7489d02dea 
							
						 
					 
					
						
						
							
							Update Indonesian Example Phrases   ( #6124 )  
						
						
						
						
						* create contributor agreement
* Update Indonesian example. (see  #1107 )
Update Indonesian examples with more appropriate phrases. The current phrases contain sensitive and violent words. 
						
					 
					
						2020-09-23 14:02:26 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6c85fab316 
							
						 
					 
					
						
						
							
							state_type and extra_state_tokens instead of nr_feature_tokens  
						
						
						
					 
					
						2020-09-23 13:35:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7745d77a38 
							
						 
					 
					
						
						
							
							Fix whitespace in template [ci skip]  
						
						
						
					 
					
						2020-09-23 13:21:42 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6435458d51 
							
						 
					 
					
						
						
							
							simplify expression  
						
						
						
					 
					
						2020-09-23 12:12:38 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							20b0ec5dcf 
							
						 
					 
					
						
						
							
							avoid logging performance of frozen components  
						
						
						
					 
					
						2020-09-23 10:37:12 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ae5dacf75f 
							
						 
					 
					
						
						
							
							Tidy up and add types  
						
						
						
					 
					
						2020-09-23 10:14:34 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6ca06cb62c 
							
						 
					 
					
						
						
							
							Update docs and formatting [ci skip]  
						
						
						
					 
					
						2020-09-23 10:14:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							888f936a73 
							
						 
					 
					
						
						
							
							Merge pull request  #6106  from svlandeg/feature/textcat-quickstart  
						
						
						
					 
					
						2020-09-23 10:11:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							60a317520a 
							
						 
					 
					
						
						
							
							Merge pull request  #6109  from svlandeg/feature/2rename  
						
						
						
					 
					
						2020-09-23 09:47:12 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f976bab710 
							
						 
					 
					
						
						
							
							Remove empty file [ci skip]  
						
						
						
					 
					
						2020-09-23 09:30:09 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							556f3e4652 
							
						 
					 
					
						
						
							
							add pooling to NEL's TransformerListener  
						
						
						
					 
					
						2020-09-23 09:24:28 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							4a56ea72b5 
							
						 
					 
					
						
						
							
							fallbacks for old names  
						
						
						
					 
					
						2020-09-23 09:15:07 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							86a08f819d 
							
						 
					 
					
						
						
							
							tok2vec.update instead of predict ( #6113 )  
						
						
						
					 
					
						2020-09-22 21:54:52 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e4acb28658 
							
						 
					 
					
						
						
							
							Fix norm in retokenizer split ( #6111 )  
						
						
						
						
						Parallel to behavior in merge, reset norm on original token in
retokenizer split. 
						
					 
					
						2020-09-22 21:53:33 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e0e793be4d 
							
						 
					 
					
						
						
							
							fix KB IO ( #6118 )  
						
						
						
					 
					
						2020-09-22 21:53:06 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9b4979407d 
							
						 
					 
					
						
						
							
							Fix overlapping German noun chunks ( #6112 )  
						
						
						
						
						Add a fix similar to the one in #5470 to prevent the German noun chunks iterator
from producing overlapping spans. 
						
					 
					
						2020-09-22 21:52:42 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							b1a7d6c528 
							
						 
					 
					
						
						
							
							Refactor seen token detection  
						
						
						
					 
					
						2020-09-22 14:42:51 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d53c84b6d6 
							
						 
					 
					
						
						
							
							avoid None callback ( #6100 )  
						
						
						
					 
					
						2020-09-22 13:54:44 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							535842e483 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/doc-ents-v3-2  
						
						
						
					 
					
						2020-09-22 13:45:50 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5e3b796b12 
							
						 
					 
					
						
						
							
							Validate section refs in debug config  
						
						
						
					 
					
						2020-09-22 12:24:39 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							085a1c8e2b 
							
						 
					 
					
						
						
							
							add no_output_layer to TextCatBOW config  
						
						
						
					 
					
						2020-09-22 12:06:40 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							e1b8090b9b 
							
						 
					 
					
						
						
							
							few more fixes  
						
						
						
					 
					
						2020-09-22 12:01:06 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							b556a10808 
							
						 
					 
					
						
						
							
							rename converts in_to_out  
						
						
						
					 
					
						2020-09-22 11:50:19 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							e931f4d757 
							
						 
					 
					
						
						
							
							add textcat score  
						
						
						
					 
					
						2020-09-22 10:56:43 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							396b33257f 
							
						 
					 
					
						
						
							
							add entity_linker to jinja template  
						
						
						
					 
					
						2020-09-22 10:40:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							db7126ead9 
							
						 
					 
					
						
						
							
							Increment version  
						
						
						
					 
					
						2020-09-22 10:31:26 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							135de82a2d 
							
						 
					 
					
						
						
							
							add textcat to quickstart  
						
						
						
					 
					
						2020-09-22 10:22:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6316d5f398 
							
						 
					 
					
						
						
							
							Improve messages in project CLI [ci skip]  
						
						
						
					 
					
						2020-09-22 09:45:34 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							49e80dbcac 
							
						 
					 
					
						
						
							
							Merge pull request  #6103  from explosion/chore/tidy-up-tests-docs-get-doc  
						
						
						
					 
					
						2020-09-22 09:45:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							81606b29bd 
							
						 
					 
					
						
						
							
							Merge pull request  #6104  from svlandeg/fix/debug_model [ci skip]  
						
						
						
					 
					
						2020-09-22 09:31:23 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							beb766d0a0 
							
						 
					 
					
						
						
							
							Add test  
						
						
						
					 
					
						2020-09-22 09:15:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							285fa934d8 
							
						 
					 
					
						
						
							
							Merge branch 'chore/tidy-up-tests-docs-get-doc' of  https://github.com/explosion/spaCy  into chore/tidy-up-tests-docs-get-doc  
						
						
						
					 
					
						2020-09-22 09:10:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							69f7e52c26 
							
						 
					 
					
						
						
							
							Update README.md  
						
						
						
					 
					
						2020-09-22 09:10:06 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							45b29c4a5b 
							
						 
					 
					
						
						
							
							cleanup  
						
						
						
					 
					
						2020-09-21 23:17:23 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							fa5c416db6 
							
						 
					 
					
						
						
							
							initialize through nlp object and with train_corpus  
						
						
						
					 
					
						2020-09-21 23:09:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3abc4a5adb 
							
						 
					 
					
						
						
							
							Slightly tidy doc.ents.__set__  
						
						
						
					 
					
						2020-09-21 22:58:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							67fbcb3da5 
							
						 
					 
					
						
						
							
							Tidy up tests and docs  
						
						
						
					 
					
						2020-09-21 20:43:54 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a5f6ab4943 
							
						 
					 
					
						
						
							
							Merge pull request  #6098  from adrianeboyd/feature/doc-init  
						
						
						
					 
					
						2020-09-21 18:35:20 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							f212303729 
							
						 
					 
					
						
						
							
							Add sent_starts to Doc.__init__  
						
						
						
						
						Add sent_starts to `Doc.__init__`. Officially specify `is_sent_start`
values but also convert to and accept `sent_start` internally. 
						
					 
					
						2020-09-21 17:59:09 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							447b3e5787 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/debug_model  
						
						
						
						
						# Conflicts:
#	spacy/cli/debug_model.py 
						
					 
					
						2020-09-21 16:58:40 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b3327c1e45 
							
						 
					 
					
						
						
							
							Increment version [ci skip]  
						
						
						
					 
					
						2020-09-21 16:04:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e8bcaa44f1 
							
						 
					 
					
						
						
							
							Don't auto-decompress archives with smart_open [ci skip]  
						
						
						
					 
					
						2020-09-21 16:01:46 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							6aa91c7ca0 
							
						 
					 
					
						
						
							
							Make user_data keyword-only  
						
						
						
					 
					
						2020-09-21 16:00:06 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							177df15d89 
							
						 
					 
					
						
						
							
							Implement Doc.set_ents  
						
						
						
					 
					
						2020-09-21 15:54:05 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							13fbf6556a 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/doc-ents-v3-2  
						
						
						
					 
					
						2020-09-21 14:42:04 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							eb9b447960 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/debug_model  
						
						
						
						
						# Conflicts:
#	spacy/cli/debug_model.py 
						
					 
					
						2020-09-21 14:05:16 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							ce455f30ca 
							
						 
					 
					
						
						
							
							Fix formatting  
						
						
						
					 
					
						2020-09-21 13:53:29 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							bc02e86494 
							
						 
					 
					
						
						
							
							Extend Doc.__init__ with additional annotation  
						
						
						
						
						Mostly copying from `spacy.tests.util.get_doc`, add additional kwargs to
`Doc.__init__` to initialize the most common doc/token values. 
						
					 
					
						2020-09-21 13:36:24 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							758ead8a47 
							
						 
					 
					
						
						
							
							Sync overrides with CLI overrides  
						
						
						
					 
					
						2020-09-21 12:50:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5497acf49a 
							
						 
					 
					
						
						
							
							Support config overrides via environment variables  
						
						
						
					 
					
						2020-09-21 11:25:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1114219ae3 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-09-21 10:59:07 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b2302c0a1c 
							
						 
					 
					
						
						
							
							Improve error for missing dependency  
						
						
						
					 
					
						2020-09-20 17:44:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8fb59d958c 
							
						 
					 
					
						
						
							
							Format  
						
						
						
					 
					
						2020-09-20 16:31:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dc22771f87 
							
						 
					 
					
						
						
							
							Fix sparse checkout  
						
						
						
					 
					
						2020-09-20 16:30:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a0fb5e50db 
							
						 
					 
					
						
						
							
							Use simple git clone call if not sparse  
						
						
						
					 
					
						2020-09-20 16:22:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c24d633d0 
							
						 
					 
					
						
						
							
							Use updated run_command  
						
						
						
					 
					
						2020-09-20 16:21:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							889128e5c5 
							
						 
					 
					
						
						
							
							Improve error handling in run_command  
						
						
						
					 
					
						2020-09-20 16:20:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							554c9a2497 
							
						 
					 
					
						
						
							
							Update docs [ci skip]  
						
						
						
					 
					
						2020-09-20 12:30:53 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6db1d5dc0d 
							
						 
					 
					
						
						
							
							trying some stuff  
						
						
						
					 
					
						2020-09-19 19:11:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e863b3dc14 
							
						 
					 
					
						
						
							
							Merge pull request  #6092  from adrianeboyd/bugfix/load-vocab-lookups-2  
						
						
						
					 
					
						2020-09-19 12:33:38 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							39872de1f6 
							
						 
					 
					
						
						
							
							Introducing the gpu_allocator ( #6091 )  
						
						
						
						
						* rename 'use_pytorch_for_gpu_memory' to 'gpu_allocator'
* --code instead of --code-path
* update documentation
* avoid querying the "system" section directly
* add explanation of gpu_allocator to TF/PyTorch section in docs
* fix typo
* fix typo 2
* use set_gpu_allocator from thinc 8.0.0a34
* default null instead of empty string 
						
					 
					
						2020-09-19 01:17:02 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							47080fba98 
							
						 
					 
					
						
						
							
							Minor renaming / refactoring  
						
						
						
						
						* Rename loader to `spacy.LookupsDataLoader.v1`, add debugging message
* Make `Vocab.lookups` a property 
						
					 
					
						2020-09-18 19:43:19 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							73ff52b9ec 
							
						 
					 
					
						
						
							
							hack for tok2vec listener  
						
						
						
					 
					
						2020-09-18 16:43:15 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							eed4b785f5 
							
						 
					 
					
						
						
							
							Load vocab lookups tables at beginning of training  
						
						
						
						
						Similar to how vectors are handled, move the vocab lookups to be loaded
at the start of training rather than when the vocab is initialized,
since the vocab doesn't have access to the full config when it's
created.
The option moves from `nlp.load_vocab_data` to `training.lookups`.
Typically these tables will come from `spacy-lookups-data`, but any
`Lookups` object can be provided.
The loading from `spacy-lookups-data` is now strict, so configs for each
language should specify the exact tables required. This also makes it
easier to control whether the larger clusters and probs tables are
included.
To load `lexeme_norm` from `spacy-lookups-data`:
```
[training.lookups]
@misc = "spacy.LoadLookupsData.v1"
lang = ${nlp.lang}
tables = ["lexeme_norm"]
``` 
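
The strict loading described above can be pictured with a small sketch. This is not the `spacy-lookups-data` loader: `AVAILABLE_TABLES` and `load_tables_strict` are hypothetical names, and the table contents are made up for illustration. The only assumption is that a strict loader raises when a requested table is missing instead of silently skipping it.

```python
from typing import Dict, List

# Hypothetical stand-in for the tables a lookups data package might ship.
# The entries are invented purely for illustration.
AVAILABLE_TABLES: Dict[str, Dict[str, str]] = {
    "lexeme_norm": {"'cause": "because"},
    "lemma_lookup": {"was": "be"},
}


def load_tables_strict(tables: List[str]) -> Dict[str, Dict[str, str]]:
    """Return exactly the requested tables, failing loudly if any is absent."""
    missing = [name for name in tables if name not in AVAILABLE_TABLES]
    if missing:
        # Strict behaviour: a config that asks for an unknown table is an error.
        raise ValueError(f"Required tables not found: {missing}")
    return {name: AVAILABLE_TABLES[name] for name in tables}


# Only the tables named in the config are loaded; larger tables stay out.
loaded = load_tables_strict(["lexeme_norm"])
```

Because nothing is loaded implicitly, leaving a table out of the `tables` list is enough to keep it out of the vocab.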
						
					 
					
						2020-09-18 15:59:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a127fa475e 
							
						 
					 
					
						
						
							
							Merge pull request  #6078  from svlandeg/fix/corpus  
						
						
						
					 
					
						2020-09-18 14:44:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							bbdb5f62b7 
							
						 
					 
					
						
						
							
							Temporary work-around for scoring a subset of components ( #6090 )  
						
						
						
						
						* Try hacking the scorer to work around sentence boundaries
* Upd scorer
* Set dev version
* Upd scorer hack
* Fix version
* Improve comment on hack 
						
					 
					
						2020-09-18 14:26:42 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a88106e852 
							
						 
					 
					
						
						
							
							Remove W106: HEAD and SENT_START in doc.from_array ( #6086 )  
						
						
						
						
						* Remove W106: HEAD and SENT_START in doc.from_array
This warning was hacky and being triggered too often.
* Fix test 
						
					 
					
						2020-09-18 03:01:29 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							e4fc7e0222 
							
						 
					 
					
						
						
							
							fixing output sample to proper 2D array  
						
						
						
					 
					
						2020-09-17 22:34:36 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							8b650f3a78 
							
						 
					 
					
						
						
							
							Modify setting missing and blocked entity tokens  
						
						
						
						
						In order to make it easier to construct `Doc` objects as training data,
modify how missing and blocked entity tokens are set to prioritize
setting `O` and missing entity tokens for training purposes over setting
blocked entity tokens.
* `Doc.ents` setter sets tokens outside entity spans to `O` regardless
of the current state of each token
* For `Doc.ents`, setting a span with a missing label sets the `ent_iob`
to missing instead of blocked
* `Doc.block_ents(spans)` marks spans as hard `O` for use with the
`EntityRecognizer` 
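
As a rough illustration of the first two rules, the sketch below models entity tags as plain strings. It is a toy model, not the `Doc.ents` setter: `iob_from_spans` is a hypothetical helper, spans are assumed to be `(start, end, label)` tuples with an exclusive end, and an empty string stands in for a missing annotation.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label); "" means no label


def iob_from_spans(n_tokens: int, spans: List[Span]) -> List[str]:
    """Derive per-token tags from entity spans (toy model)."""
    # Every token starts as "O", regardless of any earlier state.
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        for i in range(start, end):
            if not label:
                # A span without a label marks its tokens as missing, not blocked.
                tags[i] = ""
            else:
                tags[i] = ("B-" if i == start else "I-") + label
    return tags


tags = iob_from_spans(5, [(0, 2, "ORG"), (3, 4, "")])
```

Here tokens 0 and 1 form an `ORG` entity, token 3 is left as missing, and the remaining tokens are `O`.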
						
					 
					
						2020-09-17 21:27:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3865214343 
							
						 
					 
					
						
						
							
							Use consistent shortcut  
						
						
						
					 
					
						2020-09-17 16:57:02 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							35a3931064 
							
						 
					 
					
						
						
							
							fix typo  
						
						
						
					 
					
						2020-09-17 16:36:27 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							ddfc1fc146 
							
						 
					 
					
						
						
							
							add pretraining option to init config  
						
						
						
					 
					
						2020-09-17 16:05:40 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							427dbecdd6 
							
						 
					 
					
						
						
							
							cleanup and formatting  
						
						
						
					 
					
						2020-09-17 11:48:04 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							0c35885751 
							
						 
					 
					
						
						
							
							generalize corpora, dot notation for dev and train corpus  
						
						
						
					 
					
						2020-09-17 11:38:59 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							781fae678b 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/corpus  
						
						
						
					 
					
						2020-09-17 09:24:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8303d101a5 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a19  
						
						
						
					 
					
						2020-09-17 00:18:49 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7e4cd7575c 
							
						 
					 
					
						
						
							
							Refactor Docs.is_ flags ( #6044 )  
						
						
						
						
						* Refactor Docs.is_ flags
* Add derived `Doc.has_annotation` method
  * `Doc.has_annotation(attr)` returns `True` for partial annotation
  * `Doc.has_annotation(attr, require_complete=True)` returns `True` for
    complete annotation
* Add deprecation warnings to `is_tagged`, `is_parsed`, `is_sentenced`
and `is_nered`
* Add `Doc._get_array_attrs()`, which returns a full list of `Doc` attrs
for use with `Doc.to_array`, `Doc.to_bytes` and `Doc.from_docs`. The
list is the `DocBin` attributes list plus `SPACY` and `LENGTH`.
Notes on `Doc.has_annotation`:
* `HEAD` is converted to `DEP` because heads don't have an unset state
* Accept `IS_SENT_START` as a synonym of `SENT_START`
Additional changes:
* Add `NORM`, `ENT_ID` and `SENT_START` to default attributes for
`DocBin`
* In `Doc.from_array()` the presence of `DEP` causes `HEAD` to override
`SENT_START`
* In `Doc.from_array()` using `attrs` other than
`Doc._get_array_attrs()` (i.e., a user's custom list rather than our
default internal list) with both `HEAD` and `SENT_START` shows a warning
that `HEAD` will override `SENT_START`
* `set_children_from_heads` does not require dependency labels to set
sentence boundaries and sets `sent_start` for all non-sentence starts to
`-1`
* Fix call to set_children_from_heads
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
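
The difference between partial and complete annotation can be sketched in plain Python. This is a toy model rather than the `Doc.has_annotation` implementation: `toy_has_annotation` is a hypothetical function, and per-token values are assumed to be a list in which `None` marks an unset value.

```python
from typing import Optional, Sequence


def toy_has_annotation(
    values: Sequence[Optional[str]], require_complete: bool = False
) -> bool:
    """Report whether a per-token attribute is annotated (toy model)."""
    is_set = [value is not None for value in values]
    if require_complete:
        # Complete annotation: every token carries a value.
        return all(is_set)
    # Partial annotation: at least one token carries a value.
    return any(is_set)


tags = ["NN", None, "VBZ"]
partial = toy_has_annotation(tags)
complete = toy_has_annotation(tags, require_complete=True)
```

With one unset tag, the partial check passes while the complete check fails.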
						
					 
					
						2020-09-17 00:14:01 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a119667a36 
							
						 
					 
					
						
						
							
							Clean up spacy.tokens ( #6046 )  
						
						
						
						
						* Clean up spacy.tokens
* Update `set_children_from_heads`:
  * Don't check `dep` when setting lr_* or sentence starts
  * Set all non-sentence starts to `False`
* Use `set_children_from_heads` in `Token.head` setter
  * Reduce similar/duplicate code (admittedly adds a bit of overhead)
  * Update sentence starts consistently
* Remove unused `Doc.set_parse`
* Minor changes:
  * Declare cython variables (to avoid cython warnings)
  * Clean up imports
* Modify set_children_from_heads to set token range
Modify `set_children_from_heads` so that it adjusts tokens within a
specified range rather than the whole document.
Modify the `Token.head` setter to adjust only the tokens affected by the
new head assignment. 
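
The idea of deriving sentence starts from heads alone, without looking at dependency labels, can be sketched as follows. This is not the Cython `set_children_from_heads` code: `sentence_starts_from_heads` is a hypothetical helper that works on the whole list rather than a token range, and it assumes heads are absolute token indices, that a root points to itself, and that each sentence is a contiguous run of tokens.

```python
from typing import List


def sentence_starts_from_heads(heads: List[int]) -> List[bool]:
    """Mark the first token of each dependency tree as a sentence start."""

    def root_of(i: int) -> int:
        # Follow head links until reaching a token that is its own head.
        seen = set()
        while heads[i] != i:
            if i in seen:
                raise ValueError("cycle in heads")
            seen.add(i)
            i = heads[i]
        return i

    starts: List[bool] = []
    seen_roots = set()
    for i in range(len(heads)):
        root = root_of(i)
        # The first token attached to a new root opens a sentence;
        # every other token is explicitly a non-start (False).
        starts.append(root not in seen_roots)
        seen_roots.add(root)
    return starts


# Two trees: tokens 0-2 hang off token 1, tokens 3-4 hang off token 4.
starts = sentence_starts_from_heads([1, 1, 1, 4, 4])
```

Restricting the loop to a range of tokens, as the commit describes, would avoid recomputing the whole document when a single head changes.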
						
					 
					
						2020-09-16 20:32:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c776594ab1 
							
						 
					 
					
						
						
							
							Fix  
						
						
						
					 
					
						2020-09-16 18:15:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4a573d18b3 
							
						 
					 
					
						
						
							
							Add comment  
						
						
						
					 
					
						2020-09-16 17:51:29 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d31afc8334 
							
						 
					 
					
						
						
							
							Fix Language.link_components when model is None  
						
						
						
					 
					
						2020-09-16 17:49:48 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f3db3f6fe0 
							
						 
					 
					
						
						
							
							Add vectors option to CharacterEmbed (#6069)
						
						
						
						
						* Add vectors option to CharacterEmbed
* Update spacy/pipeline/morphologizer.pyx
* Adjust default morphologizer config
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-09-16 17:45:04 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d722a439aa 
							
						 
					 
					
						
						
							
							Remove unneeded methods in senter and morphologizer (#6074)
						
						
						
						
						Now that the tagger doesn't manage the tag map, the child classes senter
and morphologizer don't need to override the serialization methods. 
						
					 
					
						2020-09-16 17:39:41 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							87c329c711 
							
						 
					 
					
						
						
							
							Set rule-based lemmatizers as default (#6076)
						
						
						
						
						For languages without provided models and with lemmatizer rules in
`spacy-lookups-data`, make the rule-based lemmatizer the default:
Bengali, Persian, Norwegian, Swedish 
						
					 
					
						2020-09-16 17:37:29 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							1040e250d8 
							
						 
					 
					
						
						
							
							actual commit with test for custom readers with ml_datasets >= 0.2  
						
						
						
					 
					
						2020-09-16 16:41:28 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							714a5a05c6 
							
						 
					 
					
						
						
							
							test for custom readers with ml_datasets >= 0.2  
						
						
						
					 
					
						2020-09-16 16:39:55 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							0d1392340f 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/corpus  
						
						
						
					 
					
						2020-09-15 23:17:08 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							f420aa1138 
							
						 
					 
					
						
						
							
							use e.value to get to the ExceptionInfo value  
						
						
						
					 
					
						2020-09-15 22:30:09 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							7336657662 
							
						 
					 
					
						
						
							
							corpus is a Dict  
						
						
						
					 
					
						2020-09-15 22:07:16 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							51fa929f47 
							
						 
					 
					
						
						
							
							rewrite train_corpus to corpus.train in config  
						
						
						
					 
					
						2020-09-15 21:58:04 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							bd87e8686e 
							
						 
					 
					
						
						
							
							move tests to correct subdir  
						
						
						
					 
					
						2020-09-15 21:40:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							aaf01689a1 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-09-15 14:24:42 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							91a6637f74 
							
						 
					 
					
						
						
							
							Remove extra pipe config values before merging  
						
						
						
					 
					
						2020-09-15 14:24:17 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d3d7f92f05 
							
						 
					 
					
						
						
							
							Fix lang check and error handling in Language.from_config  
						
						
						
					 
					
						2020-09-15 14:24:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2ed6e2a218 
							
						 
					 
					
						
						
							
							Auto-format  
						
						
						
					 
					
						2020-09-15 14:20:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2214d1bb7b 
							
						 
					 
					
						
						
							
							Merge pull request #6067 from explosion/feature/spacy-blank-from-config
						
						
						
					 
					
						2020-09-15 14:18:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							253ba5ef14 
							
						 
					 
					
						
						
							
							Raise for bad Vocab values  
						
						
						
					 
					
						2020-09-15 13:25:34 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							7677e5c0e2 
							
						 
					 
					
						
						
							
							fix wandb logger when calling multiple times from same script  
						
						
						
					 
					
						2020-09-15 12:56:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							eff9406718 
							
						 
					 
					
						
						
							
							Support vocab arg in spacy.blank  
						
						
						
					 
					
						2020-09-15 11:39:36 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							99549a5ace 
							
						 
					 
					
						
						
							
							Fix consistency and update docs  
						
						
						
					 
					
						2020-09-15 11:37:37 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7dfc4bc062 
							
						 
					 
					
						
						
							
							Allow overriding meta from spacy.blank  
						
						
						
					 
					
						2020-09-15 11:12:12 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0f943157af 
							
						 
					 
					
						
						
							
							Delegate to Language.from_config in spacy.blank  
						
						
						
					 
					
						2020-09-15 11:07:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e977086a9a 
							
						 
					 
					
						
						
							
							Update default pretraining config [ci skip]  
						
						
						
					 
					
						2020-09-15 01:12:02 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							154752f9c2 
							
						 
					 
					
						
						
							
							Update docs and consistency [ci skip]  
						
						
						
					 
					
						2020-09-15 00:32:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9cc304c194 
							
						 
					 
					
						
						
							
							Merge pull request #6064 from explosion/fix/sparse-checkout-ux
						
						
						
						
						Fix sparse checkout and error handling 
						
					 
					
						2020-09-15 00:32:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							475323cd36 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a18  
						
						
						
					 
					
						2020-09-14 22:05:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e8378b57bc 
							
						 
					 
					
						
						
							
							Fix test  
						
						
						
					 
					
						2020-09-14 21:21:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							adf0bab23a 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-09-14 21:04:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ae15fa9688 
							
						 
					 
					
						
						
							
							Fix iob converter  
						
						
						
					 
					
						2020-09-14 21:02:18 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3216a33149 
							
						 
					 
					
						
						
							
							positive_label config for textcat (#6062)
						
						
						
						
						* hook up positive_label in textcat
* unit tests
* documentation
* formatting
* tests
* fix typo
* move verify_config to after begin_training
* revert accidental commit
						
					 
					
						2020-09-14 17:08:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c052017025 
							
						 
					 
					
						
						
							
							Fix sparse checkout and error handling  
						
						
						
					 
					
						2020-09-14 14:12:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fdd2340f6c 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a17  
						
						
						
					 
					
						2020-09-13 23:52:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							416deb412f 
							
						 
					 
					
						
						
							
							Prevent duplicate traceback on CalledProcessError [ci skip]  
						
						
						
					 
					
						2020-09-13 19:28:54 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							61a4ef0b46 
							
						 
					 
					
						
						
							
							Fix syntax error  
						
						
						
					 
					
						2020-09-13 19:23:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b693d2d224 
							
						 
					 
					
						
						
							
							Fix speed report in table  
						
						
						
					 
					
						2020-09-13 17:39:31 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							744df9814a 
							
						 
					 
					
						
						
							
							define threshold for scoring textcat in TextCat config (#6055)
						
						
						
						
						* define threshold for scoring textcat in TextCat config
* fix unit test and documentation 
						
					 
					
						2020-09-13 14:15:52 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ab270364f1 
							
						 
					 
					
						
						
							
							Modify Token.morph to enable unsetting (#6043)
						
						
						
						
						Modify `Token.morph` property so that `Token.c.morph` can be reset back
to an internal value of `0`. Allow setting `Token.morph` from a hash as
long as the morph string is already in the `StringStore`, setting it
indirectly through `Token.morph_` so that the value is added to the
morphology. If the hash is not in the `StringStore`, raise an error. 
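
The setter behavior described above can be sketched with toy classes. Nothing here is spaCy code: `ToyStringStore`, `ToyToken`, and the CRC-based hash are hypothetical stand-ins, chosen so the example runs without spaCy installed. The sketch shows the three cases from the description: `0` resets the value, a known hash is routed through the string setter, and an unknown hash raises an error.

```python
import zlib


class ToyStringStore:
    """Hypothetical stand-in for a string store: maps hashes to strings."""

    def __init__(self) -> None:
        self._strings = {}

    def add(self, text: str) -> int:
        # Offset by 1 so a real string never hashes to 0, the "unset" value.
        key = zlib.crc32(text.encode("utf8")) + 1
        self._strings[key] = text
        return key

    def __contains__(self, key: int) -> bool:
        return key in self._strings

    def __getitem__(self, key: int) -> str:
        return self._strings[key]


class ToyToken:
    """Hypothetical token with a hash-valued `morph` and string-valued `morph_`."""

    def __init__(self, strings: ToyStringStore) -> None:
        self.strings = strings
        self._morph = 0  # internal value 0 means "unset"

    @property
    def morph_(self) -> str:
        return "" if self._morph == 0 else self.strings[self._morph]

    @morph_.setter
    def morph_(self, features: str) -> None:
        # String path: register the value, then store its hash.
        self._morph = self.strings.add(features)

    @property
    def morph(self) -> int:
        return self._morph

    @morph.setter
    def morph(self, key: int) -> None:
        if key == 0:
            self._morph = 0  # reset back to the unset state
        elif key in self.strings:
            self.morph_ = self.strings[key]  # set indirectly via the string
        else:
            raise ValueError(f"hash {key} is not in the string store")


store = ToyStringStore()
token = ToyToken(store)
known = store.add("Number=Sing")
token.morph = known
```

Routing the hash through the string setter means there is a single code path that registers values, which is the design choice the commit message points at.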
						
					 
					
						2020-09-13 14:06:07 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c7bd631b5f 
							
						 
					 
					
						
						
							
							Fix token.idx for special cases with affixes (#6035)
						
						
						
					 
					
						2020-09-13 14:05:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							54c40223a1 
							
						 
					 
					
						
						
							
							Improve v3 pretrain command (#6040)
						
						
						
						
						* Starts to run
* Update pretrain script
* Update corpus
* Update pretrain schema
* Remove outdated test
* Make JsonlTexts produce Example objects. 
						
					 
					
						2020-09-13 14:05:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							febb99916d 
							
						 
					 
					
						
						
							
							Tidy up and auto-format [ci skip]  
						
						
						
					 
					
						2020-09-13 10:55:36 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a5633b205f 
							
						 
					 
					
						
						
							
							Fix handling of errors around git [ci skip]  
						
						
						
					 
					
						2020-09-13 10:52:28 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f8846c198d 
							
						 
					 
					
						
						
							
							Update types and docstrings  
						
						
						
					 
					
						2020-09-13 10:52:02 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e92e850c72 
							
						 
					 
					
						
						
							
							Raise if empty examples (#6052)
						
						
						
						
						* raise error if no valid Example objects were found during initialization
* fix max_length parameter
* remove commit from other branch
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-09-12 21:01:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							37347830d4 
							
						 
					 
					
						
						
							
							Fix reading in GloVe vectors  
						
						
						
					 
					
						2020-09-12 17:31:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b41be87213 
							
						 
					 
					
						
						
							
							Merge pull request #6051 from svlandeg/feature/cli-config
						
						
						
					 
					
						2020-09-12 17:12:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							eedaaaec75 
							
						 
					 
					
						
						
							
							Fix handling of existing asset without checksum [ci skip]  
						
						
						
					 
					
						2020-09-12 17:02:53 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							a75cfe0da6 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/cli-config  
						
						
						
					 
					
						2020-09-12 14:44:40 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							115147804a 
							
						 
					 
					
						
						
							
							string_to_list to parse comma-separated string into a list  
						
						
						
					 
					
						2020-09-12 14:43:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f886f5bbc8 
							
						 
					 
					
						
						
							
							Merge pull request #6048 from explosion/fix/clone-compat
						
						
						
					 
					
						2020-09-12 10:30:49 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							711166a75a 
							
						 
					 
					
						
						
							
							prevent overwriting score_weights  
						
						
						
					 
					
						2020-09-11 15:12:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							62eec33bc4 
							
						 
					 
					
						
						
							
							Fix meta.json validation  
						
						
						
					 
					
						2020-09-11 11:38:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0b2e07215d 
							
						 
					 
					
						
						
							
							Support overwriting name on spacy package  
						
						
						
					 
					
						2020-09-11 11:38:28 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							5b94aeece9 
							
						 
					 
					
						
						
							
							support pipeline as "list in string"  
						
						
						
					 
					
						2020-09-11 11:08:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1bce432b4a 
							
						 
					 
					
						
						
							
							Adjust message [ci skip]  
						
						
						
					 
					
						2020-09-11 10:00:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5acd4fbcd8 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into fix/clone-compat  
						
						
						
					 
					
						2020-09-11 09:58:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							761bd60d43 
							
						 
					 
					
						
						
							
							Adjust info message  
						
						
						
					 
					
						2020-09-11 09:57:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6831161bfa 
							
						 
					 
					
						
						
							
							Resolve path to be extra sure  
						
						
						
					 
					
						2020-09-11 09:56:49 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							1723fb73c4 
							
						 
					 
					
						
						
							
							remove brol  
						
						
						
					 
					
						2020-09-10 17:44:59 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							08a831ce83 
							
						 
					 
					
						
						
							
							process trailing slash if any  
						
						
						
					 
					
						2020-09-10 17:39:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3e83a509bb 
							
						 
					 
					
						
						
							
							WIP: fix project clone compatibility  
						
						
						
					 
					
						2020-09-10 15:49:13 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							f1bc09c1e9 
							
						 
					 
					
						
						
							
							restore partly  
						
						
						
					 
					
						2020-09-10 14:53:02 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							3889747119 
							
						 
					 
					
						
						
							
							asset fix & UX  
						
						
						
					 
					
						2020-09-10 14:36:53 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							a36766d153 
							
						 
					 
					
						
						
							
							hookup branch  
						
						
						
					 
					
						2020-09-10 12:00:34 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							97d99f7efa 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/doc-fixes  
						
						
						
					 
					
						2020-09-10 11:51:34 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							908f3a4494 
							
						 
					 
					
						
						
							
							Update default projects repo [ci skip]  
						
						
						
					 
					
						2020-09-10 11:42:14 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							92f9d2f406 
							
						 
					 
					
						
						
							
							small UX fixes  
						
						
						
					 
					
						2020-09-10 11:35:50 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							1fc5486792 
							
						 
					 
					
						
						
							
							more fine-grained errors for git_sparse_checkout  
						
						
						
					 
					
						2020-09-10 11:31:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							15bc3a37b4 
							
						 
					 
					
						
						
							
							Add --branch to project clone  
						
						
						
					 
					
						2020-09-10 11:08:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1955aaaa20 
							
						 
					 
					
						
						
							
							Merge pull request #6045 from svlandeg/feature/more-layers-docs [ci skip]
						
						
						
					 
					
						2020-09-09 21:46:40 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cb66ea7400 
							
						 
					 
					
						
						
							
							Remove simple_ner code (#6041)
						
						
						
						
						* remove simple_ner code
* remove unused _biluo and _iob files 
						
					 
					
						2020-09-09 16:11:27 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							39aa740777 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/more-layers-docs  
						
						
						
					 
					
						2020-09-09 11:59:34 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8e7557656f 
							
						 
					 
					
						
						
							
							Renaming gold & annotation_setter (#6042)
						
						
						
						
						* version bump to 3.0.0a16
* rename "gold" folder to "training"
* rename 'annotation_setter' to 'set_extra_annotations'
* formatting 
						
					 
					
						2020-09-09 10:31:03 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							60f22e1800 
							
						 
					 
					
						
						
							
							Pipe API (#6034)
						
						
						
						
						* ensure Language passes on valid examples for initialization
* fix tagger model initialization
* check for valid get_examples across components
* assume labels were added before begin_training
* fix senter initialization
* fix morphologizer initialization
* use methods to check arguments
* test textcat init, requires thinc>=8.0.0a31
* fix tok2vec init
* fix entity linker init
* use islice
* fix simple NER
* cleanup debug model
* fix assert statements
* fix tests
* throw error when adding a label if the output layer can't be resized anymore
* fix test
* add failing test for simple_ner
* UX improvements
* morphologizer UX
* assume begin_training gets a representative set and processes the labels
* remove assumptions for output of untrained NER model
* restore test for original purpose 
						
					 
					
						2020-09-08 22:44:25 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							d0a8849e4d 
							
						 
					 
					
						
						
							
							fix typo  
						
						
						
					 
					
						2020-09-08 18:32:12 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							bd8f9b188b 
							
						 
					 
					
						
						
							
							small fixes  
						
						
						
					 
					
						2020-09-08 17:24:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4b82882767 
							
						 
					 
					
						
						
							
							Fix defaults  
						
						
						
					 
					
						2020-09-08 15:31:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5d09e3e154 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a15  
						
						
						
					 
					
						2020-09-08 15:25:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ba5f4c9b32 
							
						 
					 
					
						
						
							
							Add words and seconds to train info  
						
						
						
					 
					
						2020-09-08 15:24:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b470062153 
							
						 
					 
					
						
						
							
							Add CLI registry (#6037)
						
						
						
					 
					
						2020-09-08 15:23:34 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							06ef66fd73 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/more-layers-docs  
						
						
						
					 
					
						2020-09-08 10:28:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dae22f3dfa 
							
						 
					 
					
						
						
							
							Fix ignoring of punct labels  
						
						
						
					 
					
						2020-09-05 14:11:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							12e1279f6b 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a14  
						
						
						
					 
					
						2020-09-05 04:13:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4b7abaafdb 
							
						 
					 
					
						
						
							
							Fix learn rate for non-transformer  
						
						
						
					 
					
						2020-09-04 21:22:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							465785a672 
							
						 
					 
					
						
						
							
							Fix project pull and push  
						
						
						
					 
					
						2020-09-04 21:15:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f174c7b1f3 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into pr/6018  
						
						
						
					 
					
						2020-09-04 15:54:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f06eed800e 
							
						 
					 
					
						
						
							
							Merge pull request  #6029  from explosion/master-tmp  
						
						
						
					 
					
						2020-09-04 15:11:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f9550b4493 
							
						 
					 
					
						
						
							
							Fix components in meta.json and website [ci skip]  
						
						
						
					 
					
						2020-09-04 14:42:12 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d7cc2ee72d 
							
						 
					 
					
						
						
							
							Fix tests  
						
						
						
					 
					
						2020-09-04 14:05:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							90043a6f9b 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-09-04 13:42:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							df0b68f60e 
							
						 
					 
					
						
						
							
							Remove unicode declarations and update language data  
						
						
						
					 
					
						2020-09-04 13:19:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ba600f91c5 
							
						 
					 
					
						
						
							
							Tidy up imports  
						
						
						
					 
					
						2020-09-04 13:15:44 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							864a697e63 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into master-tmp  
						
						
						
					 
					
						2020-09-04 13:15:36 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b927893309 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/dependency-matcher-v3  
						
						
						
					 
					
						2020-09-04 13:03:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ab1bb421ed 
							
						 
					 
					
						
						
							
							Update docs links in codebase  
						
						
						
					 
					
						2020-09-04 12:58:50 +02:00 
						 
				 
			
				
					
						
							
							
								holubvl3 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0a27fca557 
							
						 
					 
					
						
						
							
							Create examples.py ( #5985 )  
						
						
						
						
						* Create examples.py
* Create tag_map.py
* Delete tag_map.py
* Update examples.py
formatting: add empty line
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> 
						
					 
					
						2020-09-04 11:00:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2189046869 
							
						 
					 
					
						
						
							
							Merge pull request  #6024  from explosion/chore/registry-renaming  
						
						
						
					 
					
						2020-09-04 10:54:10 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							c32fcdf4c9 
							
						 
					 
					
						
						
							
							fix typo  
						
						
						
					 
					
						2020-09-04 09:10:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							595f9dc2e4 
							
						 
					 
					
						
						
							
							Make displacy color registry consistent with others  
						
						
						
						
						This was the only registry that expected the registered objects to be dictionaries instead of functions that return something. We can still support plain dicts but we should also support functions for consistency 
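
A minimal sketch of the idea, independent of spaCy: `resolve_colors` and `custom_colors` are hypothetical names used only for illustration, not the library's API. It assumes a registered entry is either a plain dict or a zero-argument function that returns one.

```python
from typing import Callable, Dict, Union

ColorEntry = Union[Dict[str, str], Callable[[], Dict[str, str]]]


def resolve_colors(entry: ColorEntry) -> Dict[str, str]:
    """Return a color mapping from a plain dict or a function that returns one."""
    # Hypothetical helper: functions are called, plain dicts are used as-is.
    colors = entry() if callable(entry) else entry
    if not isinstance(colors, dict):
        raise TypeError(f"Expected a dict of colors, got {type(colors).__name__}")
    return dict(colors)


def custom_colors() -> Dict[str, str]:
    # Function-style entry, consistent with registries that hold functions.
    return {"FRUIT": "#ffcc00"}


plain = resolve_colors({"ORG": "#7aecec"})    # plain dict still works
from_function = resolve_colors(custom_colors)  # function is called first
```

Accepting both forms keeps existing dict-based entries working while letting new entries follow the function convention.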
						
					 
					
						2020-09-03 23:05:41 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1c07820681 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-09-03 18:54:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7be8a0516a 
							
						 
					 
					
						
						
							
							Fix project pull  
						
						
						
					 
					
						2020-09-03 18:54:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							23b7d9cfa3 
							
						 
					 
					
						
						
							
							Prefix span getters  
						
						
						
					 
					
						2020-09-03 17:37:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5afe6447cd 
							
						 
					 
					
						
						
							
							registry.assets -> registry.misc  
						
						
						
					 
					
						2020-09-03 17:31:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c063e55eb7 
							
						 
					 
					
						
						
							
							Add prefix to batchers  
						
						
						
					 
					
						2020-09-03 17:30:41 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							896caf45e3 
							
						 
					 
					
						
						
							
							Merge pull request  #6023  from explosion/ux/model-terminology-consistency [ci skip]  
						
						
						
					 
					
						2020-09-03 17:13:44 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c53b1433b9 
							
						 
					 
					
						
						
							
							Adjust more arguments [ci skip]  
						
						
						
					 
					
						2020-09-03 17:12:24 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b5a0657fd6 
							
						 
					 
					
						
						
							
							"model" terminology consistency in docs  
						
						
						
					 
					
						2020-09-03 13:13:03 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f038841798 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-09-03 12:52:39 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ef0d0630a4 
							
						 
					 
					
						
						
							
							Let Language.use_params work with falsey inputs
						
						
						
						
						The Language.use_params method was failing if you passed in None, which
meant we had to use awkward conditionals for the parameter averaging.
This solves the problem. 
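
The behaviour described above might look roughly like the sketch below. This is not the actual `Language.use_params` implementation: `TinyModel` and the standalone `use_params` function are hypothetical stand-ins, and the only point shown is that a falsey argument becomes a no-op so callers need no conditional.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, Optional


class TinyModel:
    """Hypothetical stand-in for an object holding named parameters."""

    def __init__(self) -> None:
        self.params: Dict[str, float] = {"W": 1.0}


@contextmanager
def use_params(model: TinyModel, params: Optional[Dict[str, float]]) -> Iterator[None]:
    """Temporarily swap in params; None or an empty dict leaves the model untouched."""
    if not params:
        # Falsey input: nothing to swap in, so just run the block.
        yield
        return
    backup = dict(model.params)
    model.params.update(params)
    try:
        yield
    finally:
        # Always restore the original parameters afterwards.
        model.params = backup


model = TinyModel()
with use_params(model, None):
    unchanged = model.params["W"]
with use_params(model, {"W": 0.5}):
    averaged = model.params["W"]
restored = model.params["W"]
```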
						
					 
					
						2020-09-03 12:51:04 +02:00 
						 
				 
			
				
					
						
							
							
								Yohei Tamura 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5af432e0f2 
							
						 
					 
					
						
						
							
							fix for empty string ( #5936 )  
						
						
						
					 
					
						2020-09-03 10:09:03 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							77ac4a38aa 
							
						 
					 
					
						
						
							
							Simplify specials and cache checks ( #6012 )  
						
						
						
					 
					
						2020-09-03 09:42:49 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							8b5594df86 
							
						 
					 
					
						
						
							
							Remove near-duplicate test  
						
						
						
					 
					
						2020-09-02 20:32:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							122cb02001 
							
						 
					 
					
						
						
							
							Fix averages  
						
						
						
					 
					
						2020-09-02 19:37:43 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							960d9cfadc 
							
						 
					 
					
						
						
							
							Officially support DependencyMatcher  
						
						
						
						
						Add official support for the `DependencyMatcher`. Redesign the pattern
specification. Fix and extend operator implementations. Update API docs
and add usage docs.

Patterns
--------

Refactor pattern structure to:

```
{
  "LEFT_ID": str,
  "REL_OP": str,
  "RIGHT_ID": str,
  "RIGHT_ATTRS": dict,
}
```

The first node contains only `RIGHT_ID` and `RIGHT_ATTRS` and all
subsequent nodes contain all four keys.

New operators
-------------

Because of the way patterns are constructed from left to right, it's
helpful to have `follows` operators along with `precedes` operators. Add
operators for simple precedes / follows alongside immediate precedes /
follows.

* `.*`: precedes
* `;`: immediately follows
* `;*`: follows

Operator fixes
--------------

* `<` and `<<` do not include the node itself
* Fix reversed order for all operators involving linear precedence (`.`,
  all sibling operators)
* Linear precedence operators do not match nodes outside the same parse

Additional fixes
----------------

* Use v3 Matcher API
* Support `get` and `remove`
* Support pickling 
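
The key layout described under "Patterns" can be checked with a few lines of plain Python. This is only a sketch: `check_pattern` is a hypothetical helper, not part of spaCy, and the node names and attribute values are made up for illustration. It verifies nothing beyond which keys each node carries.

```python
from typing import Any, Dict, List

FIRST_NODE_KEYS = {"RIGHT_ID", "RIGHT_ATTRS"}
LATER_NODE_KEYS = {"LEFT_ID", "REL_OP", "RIGHT_ID", "RIGHT_ATTRS"}


def check_pattern(pattern: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the key layout is as described."""
    problems = []
    for i, node in enumerate(pattern):
        # The first node anchors the pattern; later nodes attach to an earlier one.
        expected = FIRST_NODE_KEYS if i == 0 else LATER_NODE_KEYS
        if set(node) != expected:
            problems.append(
                f"node {i}: expected keys {sorted(expected)}, got {sorted(node)}"
            )
    return problems


# Illustrative pattern (names and attribute values are assumptions).
pattern = [
    {"RIGHT_ID": "subject", "RIGHT_ATTRS": {"DEP": "nsubj"}},
    {
        "LEFT_ID": "subject",
        "REL_OP": "<",
        "RIGHT_ID": "verb",
        "RIGHT_ATTRS": {"POS": "VERB"},
    },
]
# A first node that wrongly carries all four keys.
bad_pattern = [{"LEFT_ID": "x", "REL_OP": "<", "RIGHT_ID": "y", "RIGHT_ATTRS": {}}]

ok = check_pattern(pattern)
errors = check_pattern(bad_pattern)
```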
						
					 
					
						2020-09-02 17:45:29 +02:00 
						 
				 
			
				
					
						
							
							
								Marek Grzenkowicz 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							92d7832a86 
							
						 
					 
					
						
						
							
							Fix off-by-one error for best iteration calculation (closes #6014) (#6016)
						
						
						
					 
					
						2020-09-02 15:15:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							737a1408d9 
							
						 
					 
					
						
						
							
							Improve implementation of fix #6010
						
						
						
						
						Follow-ups to the parser efficiency fix.
* Avoid introducing new counter for number of pushes
* Base cut on number of transitions, keeping it more even
* Reintroduce the randomization we had in v2. 
						
					 
					
						2020-09-02 14:42:32 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							eb56377799 
							
						 
					 
					
						
						
							
							Fix overfitting test ( #6011 )  
						
						
						
						
						* remove unused MORPH_RULES
* fix textcat architecture in overfitting test 
						
					 
					
						2020-09-02 13:07:41 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b97d98783a 
							
						 
					 
					
						
						
							
							Fix Hungarian % tokenization ( #6013 )  
						
						
						
					 
					
						2020-09-02 13:06:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c1bf3a5602 
							
						 
					 
					
						
						
							
							Fix significant performance bug in parser training ( #6010 )  
						
						
						
						
						The parser training makes use of a trick for long documents, where we
use the oracle to cut up the document into sections, so that we can have
batch items in the middle of a document. For instance, if we have one
document of 600 words, we might make 6 states, starting at words 0, 100,
200, 300, 400 and 500.

The problem is for v3, I screwed this up and didn't stop parsing! So
instead of a batch of [100, 100, 100, 100, 100, 100], we'd have a batch
of [600, 500, 400, 300, 200, 100]. Oops.

The implementation here could probably be improved, it's annoying to
have this extra variable in the state. But this'll do.

This makes the v3 parser training 5-10 times faster, depending on document
lengths. This problem wasn't in v2. 
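
The arithmetic in the 600-word example can be reproduced with a short sketch. The function names are hypothetical and this is not the parser's code; it only counts how many words each state would process when it stops at the next section boundary versus when it keeps going to the end of the document.

```python
from typing import List


def section_starts(n_words: int, section_len: int) -> List[int]:
    """Word offsets at which each state begins."""
    return list(range(0, n_words, section_len))


def intended_lengths(n_words: int, section_len: int) -> List[int]:
    # Each state stops where the next section begins.
    return [min(section_len, n_words - s) for s in section_starts(n_words, section_len)]


def unbounded_lengths(n_words: int, section_len: int) -> List[int]:
    # Each state keeps parsing until the end of the document.
    return [n_words - s for s in section_starts(n_words, section_len)]


starts = section_starts(600, 100)
intended = intended_lengths(600, 100)
unbounded = unbounded_lengths(600, 100)
print(starts)                         # [0, 100, 200, 300, 400, 500]
print(intended, sum(intended))        # [100, 100, 100, 100, 100, 100] 600
print(unbounded, sum(unbounded))      # [600, 500, 400, 300, 200, 100] 2100
```

The gap between the two totals grows with the number of sections per document, which fits the note that the speed-up depends on document lengths.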
						
					 
					
						2020-09-02 12:57:13 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f7a25d69f7 
							
						 
					 
					
						
						
							
							Bugfix in merge_entities ( #6005 )  
						
						
						
						
						* failing test
* bugfix 
						
					 
					
						2020-09-01 21:57:52 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6bfb1b3a29 
							
						 
					 
					
						
						
							
							Fix sparse checkout for 'spacy project' ( #6008 )  
						
						
						
						
						* exit if cloning fails
* UX
* rewrite http link to git protocol, don't use stdin
* fixes to sparse checkout
* formatting 
						
					 
					
						2020-09-01 19:49:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4cce32f090 
							
						 
					 
					
						
						
							
							Fix tagger initialization  
						
						
						
					 
					
						2020-09-01 16:38:34 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							046c38bd26 
							
						 
					 
					
						
						
							
							Remove 'cleanup' of strings ( #6007 )  
						
						
						
						
						A long time ago we went to some trouble to try to clean up "unused"
strings, to avoid the `StringStore` growing in long-running processes.
This never really worked reliably, and I think it was a really wrong
approach. It's much better to let the user reload the `nlp` object as
necessary, now that the string encoding is stable (in v1, the string IDs
were sequential integers, making reloading the NLP object really
annoying.)
The extra book-keeping does make some performance difference, and the
feature is unused, so it's past time we killed it. 
						
					 
					
						2020-09-01 16:12:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							70b226f69d 
							
						 
					 
					
						
						
							
							Support ignore marker in project document [ci skip]  
						
						
						
					 
					
						2020-09-01 12:49:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a4c51f0f18 
							
						 
					 
					
						
						
							
							Add v3 info to project docs [ci skip]  
						
						
						
					 
					
						2020-09-01 12:36:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ef9005273b 
							
						 
					 
					
						
						
							
							Update fill-config command and add silent mode [ci skip]  
						
						
						
					 
					
						2020-09-01 12:07:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ec660e3131 
							
						 
					 
					
						
						
							
							Fix use_pytorch_for_gpu_memory  
						
						
						
					 
					
						2020-09-01 00:41:38 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9130094199 
							
						 
					 
					
						
						
							
							Prevent Tagger model init with 0 labels ( #5984 )  
						
						
						
						
						* Prevent Tagger model init with 0 labels
Raise an error before trying to initialize a tagger model with 0 labels.
* Add dummy tagger label for test
* Remove tagless tagger model initialization
* Fix error number after merge
* Add dummy tagger label to test
* Fix formatting
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-31 21:24:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthw Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c38298b8fa 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-08-31 19:55:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthw Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fe298fa50a 
							
						 
					 
					
						
						
							
							Shuffle on first epoch of train  
						
						
						
					 
					
						2020-08-31 19:55:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9af82f3f11 
							
						 
					 
					
						
						
							
							Merge pull request  #6003  from explosion/feature/matcher-as-spans  
						
						
						
					 
					
						2020-08-31 17:50:56 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							add9de5487 
							
						 
					 
					
						
						
							
							Deprecate (Phrase)Matcher.pipe  
						
						
						
					 
					
						2020-08-31 17:01:24 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							83aff38c59 
							
						 
					 
					
						
						
							
							Make argument keyword-only  
						
						
						
						
						Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-31 15:39:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6340d1c63d 
							
						 
					 
					
						
						
							
							Add as_spans to Matcher/PhraseMatcher  
						
						
						
					 
					
						2020-08-31 14:53:22 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							13ee742fb4 
							
						 
					 
					
						
						
							
							example of custom logger  
						
						
						
					 
					
						2020-08-31 14:24:41 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							c18eb63483 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/vectors-docs  
						
						
						
						
						# Conflicts:
#	website/docs/usage/embeddings-transformers.md 
						
					 
					
						2020-08-31 13:21:36 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ec14744ee4 
							
						 
					 
					
						
						
							
							Rename Transformer listener ( #6001 )  
						
						
						
						
						* rename to spacy-transformers.TransformerListener
* add some more tok2vec tests
* use select_pipes
* fix docs - annotation setter was not changed in the end 
						
					 
					
						2020-08-31 12:41:39 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							216efaf5f5 
							
						 
					 
					
						
						
							
							Restrict tokenizer exceptions to ORTH and NORM  
						
						
						
					 
					
						2020-08-31 09:55:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9341cbc013 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a13  
						
						
						
					 
					
						2020-08-30 23:10:43 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							45f46a5c85 
							
						 
					 
					
						
						
							
							Merge pull request  #5993  from explosion/feature/disabled-components  
						
						
						
					 
					
						2020-08-29 15:58:41 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							34146750d4 
							
						 
					 
					
						
						
							
							Use frozen list with custom errors  
						
						
						
						
						We don't want to break backwards compatibility too much but we also want to provide the best possible UX 
						
					 
					
						2020-08-29 15:20:11 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							744f432420 
							
						 
					 
					
						
						
							
							Merge pull request  #5994  from explosion/feature/idempotent-component-decorator  
						
						
						
					 
					
						2020-08-29 13:17:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5de3f8604d 
							
						 
					 
					
						
						
							
							Update spacy/util.py  
						
						
						
						
						Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-29 13:17:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							091a9b522a 
							
						 
					 
					
						
						
							
							Remove unused variable [ci skip]  
						
						
						
					 
					
						2020-08-29 13:11:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2bc31e15c9 
							
						 
					 
					
						
						
							
							Tidy up and auto-format [ci skip]  
						
						
						
					 
					
						2020-08-29 13:01:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6520d1a1df 
							
						 
					 
					
						
						
							
							Work around set order in Language.disabled  
						
						
						
					 
					
						2020-08-29 12:58:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f45095a666 
							
						 
					 
					
						
						
							
							Merge pull request  #5995  from adrianeboyd/bugfix/attribute-ruler-bugfixes  
						
						
						
					 
					
						2020-08-29 12:38:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e0b4984aa4 
							
						 
					 
					
						
						
							
							Make deprecated disable_pipes call into select_pipes  
						
						
						
					 
					
						2020-08-29 12:08:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							15d73f4dc3 
							
						 
					 
					
						
						
							
							Make user-facing Language.disabled return list  
						
						
						
						
						More consistent with all the other properties 
						
					 
					
						2020-08-29 12:08:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							58f19421b1 
							
						 
					 
					
						
						
							
							Return empty batch from tok2vec listener if no doc.tensor  
						
						
						
					 
					
						2020-08-29 03:46:50 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							5230529de2 
							
						 
					 
					
						
						
							
							add loggers registry & logger docs sections  
						
						
						
					 
					
						2020-08-28 21:44:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0687d7148e 
							
						 
					 
					
						
						
							
							Rename user-facing API  
						
						
						
					 
					
						2020-08-28 21:04:02 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							0104bd1600 
							
						 
					 
					
						
						
							
							Sort the AttributeRuler matches by rule order  
						
						
						
						
						Sort the returned matches by rule order (the `match_id`) so that the
rules are applied in the order they were added. This is necessary, for
instance, if the `AttributeRuler` is used for the tag map and later
rules require POS tags. 
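
As a rough illustration of the ordering step, the sketch below sorts match tuples by their first element. It assumes, purely for the example, that each `match_id` is a small integer equal to the position at which its rule was added; `sort_by_rule_order` is a hypothetical name and not the `AttributeRuler` code.

```python
from typing import List, Tuple

Match = Tuple[int, int, int]  # (match_id, start, end)


def sort_by_rule_order(matches: List[Match]) -> List[Match]:
    """Order matches so that earlier-added rules are applied first."""
    # sorted() is stable, so matches from the same rule keep their relative order.
    return sorted(matches, key=lambda match: match[0])


# Matches as they might come back, not grouped by rule.
matches = [(2, 0, 1), (0, 3, 4), (1, 0, 1), (0, 1, 2)]
ordered = sort_by_rule_order(matches)
print(ordered)  # [(0, 3, 4), (0, 1, 2), (1, 0, 1), (2, 0, 1)]
```

With this ordering, a rule that sets tags (rule 0 here) runs before a later rule that depends on those tags.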
						
					 
					
						2020-08-28 21:01:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6a999c9303 
							
						 
					 
					
						
						
							
							Remove outdated component attr check  
						
						
						
					 
					
						2020-08-28 20:59:19 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							8674b17651 
							
						 
					 
					
						
						
							
							Serialize AttributeRuler.patterns  
						
						
						
						
						Serialize `AttributeRuler.patterns` instead of the individual lists to
simplify the serialization and so that patterns are reloaded exactly as
they were originally provided (preserving `_attrs_unnormed`). 
						
					 
					
						2020-08-28 20:44:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							10da74382f 
							
						 
					 
					
						
						
							
							Raise if disabled components are removed before DisabledPipes.restore  
						
						
						
					 
					
						2020-08-28 20:35:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1e0363290e 
							
						 
					 
					
						
						
							
							Remove todos and update docstrings  
						
						
						
					 
					
						2020-08-28 20:34:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							cad988da7f 
							
						 
					 
					
						
						
							
							Allow component decorators to re-run with same function  
						
						
						
					 
					
						2020-08-28 16:27:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3ce5be4b76 
							
						 
					 
					
						
						
							
							Allow loaded but disabled components  
						
						
						
					 
					
						2020-08-28 15:20:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							89f692bc8a 
							
						 
					 
					
						
						
							
							Merge pull request  #5992  from svlandeg/feature/wandb-restrict-config  
						
						
						
					 
					
						2020-08-28 15:05:29 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9c4049b57f 
							
						 
					 
					
						
						
							
							Merge pull request #5986 from explosion/fix/language-config-interpolate-disk-bytes
						
						
						
					 
					
						2020-08-28 15:03:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							adc050cdc5 
							
						 
					 
					
						
						
							
							Fix code style in test [ci skip]  
						
						
						
					 
					
						2020-08-28 15:03:21 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							05a1bafa15 
							
						 
					 
					
						
						
							
							fix type  
						
						
						
					 
					
						2020-08-28 14:08:33 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							33883aa764 
							
						 
					 
					
						
						
							
							rename field  
						
						
						
					 
					
						2020-08-28 14:06:23 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							1d8c4070aa 
							
						 
					 
					
						
						
							
							add disable_fields to wandb_logger  
						
						
						
					 
					
						2020-08-28 13:55:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a51b4f3a19 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into fix/language-config-interpolate-disk-bytes  
						
						
						
					 
					
						2020-08-28 13:21:17 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							03dde511b4 
							
						 
					 
					
						
						
							
							Merge pull request #5987 from explosion/feature/debug-config [ci skip]
						
						
						
					 
					
						2020-08-28 11:30:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							62e9967228 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into fix/language-config-interpolate-disk-bytes  
						
						
						
					 
					
						2020-08-28 11:19:36 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4ca2698f85 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/debug-config  
						
						
						
					 
					
						2020-08-28 11:19:17 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							9a8255ffd5 
							
						 
					 
					
						
						
							
							two tests because of different exit type  
						
						
						
					 
					
						2020-08-28 10:50:26 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							73baaf330a 
							
						 
					 
					
						
						
							
							update error type  
						
						
						
					 
					
						2020-08-28 10:46:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c558ca4485 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-08-27 19:47:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d3ffe4ca63 
							
						 
					 
					
						
						
							
							Fix error when tagger was initialized with no labels  
						
						
						
					 
					
						2020-08-27 18:56:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d1780db6a4 
							
						 
					 
					
						
						
							
							Tidy up and use different error [ci skip]  
						
						
						
					 
					
						2020-08-27 18:56:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ff4175e839 
							
						 
					 
					
						
						
							
							Add more info to debug config  
						
						
						
					 
					
						2020-08-27 18:17:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							daac8ebacd 
							
						 
					 
					
						
						
							
							Don't interpolate config on Language deserialization  
						
						
						
					 
					
						2020-08-27 16:44:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e1e1760fd6 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-08-27 03:22:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							95adb58f15 
							
						 
					 
					
						
						
							
							Force tagger to pass batch of docs into model in begin_training  
						
						
						
					 
					
						2020-08-27 03:21:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cdc114e212 
							
						 
					 
					
						
						
							
							Merge pull request #5977 from explosion/refactor/vector-names
						
						
						
					 
					
						2020-08-26 19:03:16 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8692d176f6 
							
						 
					 
					
						
						
							
							Merge pull request #5978 from explosion/feature/update-wasabi
						
						
						
						
						Update wasabi: new diff_strings and MarkdownRenderer 
						
					 
					
						2020-08-26 19:02:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9b22714a4e 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-08-26 15:48:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							172af24f95 
							
						 
					 
					
						
						
							
							Fix upload and download  
						
						
						
					 
					
						2020-08-26 15:48:23 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a5fff1df51 
							
						 
					 
					
						
						
							
							Remove outdated non-empty output dir warning [ci skip]  
						
						
						
					 
					
						2020-08-26 15:45:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2d520d3b45 
							
						 
					 
					
						
						
							
							Remove unused error  
						
						
						
					 
					
						2020-08-26 15:41:14 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							90d88729e0 
							
						 
					 
					
						
						
							
							Add AttributeRuler.score (#5963)
						
						
						
						
						* Add AttributeRuler.score
Add scoring for TAG / POS / MORPH / LEMMA if these are present in the
assigned token attributes.
Add default score weights (that don't really make a lot of sense) so
that the scores are in the default config in some form.
* Update docs 
						
					 
					
						2020-08-26 15:39:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3aec98ca38 
							
						 
					 
					
						
						
							
							Update wasabi: new diff_strings and MarkdownRenderer  
						
						
						
					 
					
						2020-08-26 15:33:11 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							79d460e3a2 
							
						 
					 
					
						
						
							
							Weights & Biases logger for train CLI (#5971)
						
						
						
						
						* quick test as part of train script
* train_logger in config, default ConsoleLogger in loggers catalogue
* entitiy typo
* add wandb_logger
* cleanup
* Update spacy/cli/train_logger.py
Co-authored-by: Ines Montani <ines@ines.io>
* move loggers to gold.loggers
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2020-08-26 15:24:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0997c30b9e 
							
						 
					 
					
						
						
							
							Merge pull request #5974 from explosion/feature/project-document
						
						
						
					 
					
						2020-08-26 15:14:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							191fb4144f 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into refactor/vector-names  
						
						
						
					 
					
						2020-08-26 14:26:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							627617a079 
							
						 
					 
					
						
						
							
							Tidy up and add docs [ci skip]  
						
						
						
					 
					
						2020-08-26 13:24:55 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							43c61da209 
							
						 
					 
					
						
						
							
							Set macro AUC score in Scorer.score_cats  
						
						
						
					 
					
						2020-08-26 10:49:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							aeebc6678d 
							
						 
					 
					
						
						
							
							Small cleanup and adjustments  
						
						
						
					 
					
						2020-08-26 10:26:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							31567d1e42 
							
						 
					 
					
						
						
							
							Link project.yml  
						
						
						
					 
					
						2020-08-26 10:26:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6c2a5ff53b 
							
						 
					 
					
						
						
							
							Auto-link local sources  
						
						
						
					 
					
						2020-08-26 10:26:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							77852d2428 
							
						 
					 
					
						
						
							
							Fix run_command for python 3.6  
						
						
						
					 
					
						2020-08-26 05:02:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							884cac5fb5 
							
						 
					 
					
						
						
							
							Make run_command backwards compatible  
						
						
						
					 
					
						2020-08-26 04:33:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6547472347 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a12  
						
						
						
					 
					
						2020-08-26 04:02:34 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7d7b65ffd4 
							
						 
					 
					
						
						
							
							Fix raw strings in URL pattern (#5972)
						
						
						
						
						Add missing raw string specifiers. 
						
					 
					
						2020-08-26 04:00:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2771e4f2b3 
							
						 
					 
					
						
						
							
							Fix the git "sparse checkout" functionality (#5973)
						
						
						
						
						* Fix the git sparse checkout functionality
* Format 
						
					 
					
						2020-08-26 04:00:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1c958a76c1 
							
						 
					 
					
						
						
							
							Add comment markers to only replace auto-generated docs  
						
						
						
					 
					
						2020-08-26 00:03:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f10989e8c4 
							
						 
					 
					
						
						
							
							Add "project document" and more project.yml meta fields  
						
						
						
					 
					
						2020-08-25 17:14:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fdcaf86c54 
							
						 
					 
					
						
						
							
							Adjust docstring  
						
						
						
						
						End sentence earlier so it's shown as a full sentence in --help 
						
					 
					
						2020-08-25 17:13:50 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b89f6fa011 
							
						 
					 
					
						
						
							
							Fix meta defaults and error in package command  
						
						
						
					 
					
						2020-08-25 17:13:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							94705c21c8 
							
						 
					 
					
						
						
							
							Allow reuse on validators to prevent reload error  
						
						
						
						
						Otherwise this will cause an error if spaCy is live reloaded, e.g. in Streamlit 
						
					 
					
						2020-08-25 17:13:11 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4f82a02b70 
							
						 
					 
					
						
						
							
							Remove 'fix_pretrained_vectors_name' hack  
						
						
						
					 
					
						2020-08-25 14:37:45 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0bab7c8b91 
							
						 
					 
					
						
						
							
							Remove PRON_LEMMA symbol (#5968)
						
						
						
					 
					
						2020-08-25 14:21:29 +02:00 
						 
				 
			
				
					
						
							
							
								Hiroshi Matsuda 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							332803eda9 
							
						 
					 
					
						
						
							
							fix ja leading spaces (#5969)
						
						
						
						
						* change condition for space after
* add NAUGHTY_STRINGS test example 
						
					 
					
						2020-08-25 14:16:24 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dd84577a98 
							
						 
					 
					
						
						
							
							Update CLI utils, project.yml schema and add test  
						
						
						
					 
					
						2020-08-25 11:54:53 +02:00 
						 
				 
			
				
					
						
							
							
								Shashank 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							450720aca2 
							
						 
					 
					
						
						
							
							Added support for Sanskrit language (#5956)
						
						
						
						
						* Added support for Sanskrit language
* Added tests for lexical attribute like_num 
						
					 
					
						2020-08-25 10:56:29 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ef43152af4 
							
						 
					 
					
						
						
							
							Update scorer  
						
						
						
					 
					
						2020-08-25 02:42:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8d6e1ce306 
							
						 
					 
					
						
						
							
							Update v3.0.0a11  
						
						
						
					 
					
						2020-08-25 00:32:08 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8038b87f04 
							
						 
					 
					
						
						
							
							Various small tweaks to project CLI (#5965)
						
						
						
						
						* Fix up/download of http and local paths
* Support git_sparse_checkout for assets
* Fix scorer
* Handle already-present directories for git assets
* Improve convert command
* Fix support for existing files in git assets
* Support branches in git sparse checkout
* Format
* Fix git assets
* Document git block in assets
* Fix test
* Fix test
* Revert "Fix test"
This reverts commit cf3097260f964d636e27 
						
					 
					
						2020-08-25 00:30:52 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							abd3f2b65a 
							
						 
					 
					
						
						
							
							Rename Polish lemmatizer method (#5960)
						
						
						
						
						Rename Polish lemmatizer method to `pos_lookup` to distinguish it from
pure token-based lookup methods. 
						
					 
					
						2020-08-25 00:22:27 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e12b03358b 
							
						 
					 
					
						
						
							
							Support removing extra values in fill-config (#5966)
						
						
						
						
						* Support removing extra values in fill-config
* Fix test 
						
					 
					
						2020-08-24 22:53:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f232d8db96 
							
						 
					 
					
						
						
							
							Report p/r/f out of 100  
						
						
						
					 
					
						2020-08-24 17:17:23 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0e7f99da58 
							
						 
					 
					
						
						
							
							Fix handling of optional [pretraining] block (#5954)
						
						
						
						
						* Fix handling of optional [pretraining] block
* Remove pretraining from default config
* Fix test
* Add schema option for empty pretrain block 
						
					 
					
						2020-08-24 15:56:03 +02:00 
						 
				 
			
				
					
						
							
							
								idoshr 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b10c7bc56e 
							
						 
					 
					
						
						
							
							Hebrew like num (#5952)
						
						
						
						
						* Update stop_words.py
Hebrew STOP WORDS
* Update stop_words.py
* contributor
* contributor
* add some common domain extensions
support human number 1K/1M....
* support human number 1K/1M....
* hebrew number tokenize
1K/1M implement in EN
* test human tokenize fix
* test
* heb like num
revert human number change
* heb like num 
						
					 
					
						2020-08-24 14:30:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							64df37643f 
							
						 
					 
					
						
						
							
							Update lockfile after project pull  
						
						
						
					 
					
						2020-08-24 03:27:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							588c28fe45 
							
						 
					 
					
						
						
							
							Fix project pull when deps missing  
						
						
						
					 
					
						2020-08-24 01:23:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							001546c19e 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a10  
						
						
						
					 
					
						2020-08-23 21:15:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							160a855246 
							
						 
					 
					
						
						
							
							Format  
						
						
						
					 
					
						2020-08-23 21:15:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							89f5b8abb3 
							
						 
					 
					
						
						
							
							Fix project push  
						
						
						
					 
					
						2020-08-23 21:14:44 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3828bc3ed0 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop
						
						
						
					 
					
						2020-08-23 18:32:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e559867605 
							
						 
					 
					
						
						
							
							Allow spacy project to push and pull to/from remote storage (#5949)
						
						
						
						
						* Add utils for working with remote storage
* WIP add remote_cache for project
* WIP add push and pull commands
* Use pathy in remote_cache
* Update util
* Update remote_cache
* Update util
* Update project assets
* Update pull script
* Update push script
* Fix type annotation in util
* Work on remote storage
* Remove site and env hash
* Fix imports
* Fix type annotation
* Require pathy
* Require pathy
* Fix import
* Add a util to handle project variable substitution
* Import push and pull commands
* Fix pull command
* Fix push command
* Fix tarfile in remote_storage
* Improve printing
* Fiddle with status messages
* Set version to v3.0.0a9
* Draft docs for spacy project remote storages
* Update docs [ci skip]
* Use Thinc config to simplify and unify template variables
* Auto-format
* Don't import Pathy globally for now
Causes slow and annoying Google Cloud warning
* Tidy up test
* Tidy up and update tests
* Update to latest Thinc
* Update docs
* variables -> vars
* Update docs [ci skip]
* Update docs [ci skip]
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2020-08-23 18:32:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fe1cf7e124 
							
						 
					 
					
						
						
							
							Allow score_weights to list extra scores  
						
						
						
					 
					
						2020-08-23 18:31:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9bdc9e81f5 
							
						 
					 
					
						
						
							
							Fix error message [ci skip]  
						
						
						
					 
					
						2020-08-23 12:14:02 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							56eabcb2f2 
							
						 
					 
					
						
						
							
							Adding num_like test for Czech (#5946)
						
						
						
						
						* Create lex_attrs.py
Hello,
I am missing Czech language support in spaCy, so I would like to help push it along a little. This file is based on the other lex_attrs.py files, translated to Czech.
* Update __init__.py
Updated for use with new Czech Lex_attrs file
* Update stop_words.py
* Create test_text.py
* add like_num testing for czech
Co-authored-by: holubvl3 <47881982+holubvl3@users.noreply.github.com>
Co-authored-by: holubvl3 <vilemrousi@gmail.com>
Co-authored-by: Vladimír Holubec <vholubec@arcdata.cz> 
						
					 
					
						2020-08-21 17:06:33 +02:00 
						 
				 
			
				
					
						
							
							
								holubvl3 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a341b4ef09 
							
						 
					 
					
						
						
							
							Adding support for Czech language (#5826)
						
						
						
						
						* Create lex_attrs.py
Hello,
I am missing Czech language support in spaCy, so I would like to help push it along a little. This file is based on the other lex_attrs.py files, translated to Czech.
* Update __init__.py
Updated for use with new Czech Lex_attrs file
* Update stop_words.py
* Create test_text.py
Co-authored-by: Vladimír Holubec <vholubec@arcdata.cz> 
						
					 
					
						2020-08-21 16:17:53 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							af36d77d01 
							
						 
					 
					
						
						
							
							fix typo in docstring  
						
						
						
					 
					
						2020-08-21 15:56:03 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							3060e4ae65 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/docs-docs-docs  
						
						
						
						
						# Conflicts:
#	website/src/widgets/quickstart-training-generator.js 
						
					 
					
						2020-08-21 15:16:30 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							cc926267f8 
							
						 
					 
					
						
						
							
							small fixes  
						
						
						
					 
					
						2020-08-21 15:05:40 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							aa6a7cd6e7 
							
						 
					 
					
						
						
							
							Update docs and consistency [ci skip]  
						
						
						
					 
					
						2020-08-21 13:49:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3826cfb8fe 
							
						 
					 
					
						
						
							
							Merge pull request #5930 from svlandeg/feature/init-config-fix
						
						
						
						
						UX for init config 
						
					 
					
						2020-08-21 12:06:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							79af7dcd6d 
							
						 
					 
					
						
						
							
							Small wording adjustments [ci skip]  
						
						
						
					 
					
						2020-08-21 12:06:19 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e60442d83a 
							
						 
					 
					
						
						
							
							Adjust label casing in displaCy NER visualizer (resolves #4866)
						
						
						
						
						- Accept any case for label names in ents and colors option, even if actual predicted label uses different casing
- Don't text-transform: uppercase visually, if it's important to users that the label is represented as-is in the UI 
						
					 
					
						2020-08-21 11:51:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c356e62908 
							
						 
					 
					
						
						
							
							Minor adjustments to quickstart template  
						
						
						
					 
					
						2020-08-21 00:10:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6ad59d59fe 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of https://github.com/explosion/spaCy into develop [ci skip]
						
						
						
					 
					
						2020-08-20 11:20:58 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							071c09ff35 
							
						 
					 
					
						
						
							
							add coding ( #5942 )  
						
						
						
					 
					
						2020-08-20 11:08:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ea6640ea72 
							
						 
					 
					
						
						
							
							Merge pull request  #5939  from explosion/feature/thinc-v8.0.0a28  
						
						
						
						
						Update Thinc and config variables 
						
					 
					
						2020-08-19 21:14:36 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3dd390b1a1 
							
						 
					 
					
						
						
							
							Update Thinc and config variables  
						
						
						
					 
					
						2020-08-19 19:46:12 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							b96cd9fa5e 
							
						 
					 
					
						
						
							
							fix typo  
						
						
						
					 
					
						2020-08-19 18:46:08 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e2f2ef3a5a 
							
						 
					 
					
						
						
							
							Update init config and recommendations  
						
						
						
						
						- As much as I dislike YAML, it seemed like a better format here because it allows us to add comments if we want to explain the different recommendations
- Don't include the generated JS in the repo by default and build it on the fly when running or deploying the site. This ensures it's always up to date.
- Simplify jinja_to_js script and use fewer dependencies 
						
					 
					
						2020-08-19 13:33:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2285e59765 
							
						 
					 
					
						
						
							
							Merge pull request  #5933  from svlandeg/feature/more-v3-docs [ci skip]  
						
						
						
					 
					
						2020-08-19 11:29:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c0f6e77a41 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a8  
						
						
						
					 
					
						2020-08-18 23:29:00 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							a8acedd4ba 
							
						 
					 
					
						
						
							
							example of custom reader and batcher  
						
						
						
					 
					
						2020-08-18 19:15:16 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							358cbb21e3 
							
						 
					 
					
						
						
							
							Define candidate generator in EL config ( #5876 )  
						
						
						
						
						* candidate generator as separate part of EL config
* update comment
* ent instead of str as input for candidate generation
* Span instead of str: correct type indication
* fix types
* unit test to create new candidate generator
* fix replace_pipe argument passing
* move error message, general cleanup
* add vocab back to KB constructor
* provide KB as callable from Vocab arg
* rename to kb_loader, fix KB serialization as part of the EL pipe
* fix typo
* reformatting
* cleanup
* fix comment
* fix wrongly duplicated code from merge conflict
* rename dump to to_disk
* from_disk instead of load_bulk
* update test after recent removal of set_morphology in tagger
* remove old doc 
						
					 
					
						2020-08-18 16:10:36 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							688e77562b 
							
						 
					 
					
						
						
							
							Train CLI script fixes ( #5931 )  
						
						
						
						
						* fix dash replacement in overrides arguments
* perform interpolation on training config
* make sure only .spacy files are read 
						
					 
					
						2020-08-18 16:06:37 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							82f0e20318 
							
						 
					 
					
						
						
							
							Update docs and consistency [ci skip]  
						
						
						
					 
					
						2020-08-18 14:39:40 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							10e67b400c 
							
						 
					 
					
						
						
							
							output_file required, spacy-transformers preferred instead of required  
						
						
						
					 
					
						2020-08-18 13:38:43 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1c3bcfb488 
							
						 
					 
					
						
						
							
							Update docs and util consistency  
						
						
						
					 
					
						2020-08-18 01:22:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							990c6b4c32 
							
						 
					 
					
						
						
							
							Update docs and CLI [ci skip]  
						
						
						
					 
					
						2020-08-17 21:38:20 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3ae5e02f4f 
							
						 
					 
					
						
						
							
							Update docs, types and API consistency  
						
						
						
					 
					
						2020-08-17 16:45:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a95a36ce2a 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a7  
						
						
						
					 
					
						2020-08-16 15:51:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6ae83bde0c 
							
						 
					 
					
						
						
							
							Fix CLI consistency [ci skip]  
						
						
						
					 
					
						2020-08-16 15:46:29 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							45f13cbf64 
							
						 
					 
					
						
						
							
							Merge pull request  #5916  from explosion/feature/new-thinc-config  
						
						
						
					 
					
						2020-08-16 15:24:12 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							34bda91695 
							
						 
					 
					
						
						
							
							Show warnings if there's nothing to auto-fill  
						
						
						
					 
					
						2020-08-16 14:19:43 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dd5804d499 
							
						 
					 
					
						
						
							
							Update type hints  
						
						
						
					 
					
						2020-08-16 14:19:33 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a570c304df 
							
						 
					 
					
						
						
							
							Update quickstart, template and docs  
						
						
						
					 
					
						2020-08-15 14:50:29 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3272a63430 
							
						 
					 
					
						
						
							
							Merge pull request  #5920  from explosion/fix/logging-warning-various  
						
						
						
					 
					
						2020-08-15 14:41:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fdcde9b0bf 
							
						 
					 
					
						
						
							
							Add init fill-config  
						
						
						
					 
					
						2020-08-14 16:49:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9ebf39fb5f 
							
						 
					 
					
						
						
							
							Relax test  
						
						
						
					 
					
						2020-08-14 16:31:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8128e5eb35 
							
						 
					 
					
						
						
							
							Replace lexeme_norm warning with logging  
						
						
						
					 
					
						2020-08-14 15:00:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							37814b608d 
							
						 
					 
					
						
						
							
							Remove env_opt and simplify default Optimizer  
						
						
						
					 
					
						2020-08-14 14:59:54 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ab1d165bba 
							
						 
					 
					
						
						
							
							Pass optimizer defined in config to resume/begin_training  
						
						
						
						
						Otherwise, this would create a default optimizer, which isn't what we want? 
						
					 
					
						2020-08-14 14:59:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e4d0990857 
							
						 
					 
					
						
						
							
							Only receive from listener if listener exists  
						
						
						
					 
					
						2020-08-14 14:58:48 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							cef97e4b63 
							
						 
					 
					
						
						
							
							Fix path check  
						
						
						
					 
					
						2020-08-14 14:58:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							db2dbc8e59 
							
						 
					 
					
						
						
							
							Remove unused warning  
						
						
						
					 
					
						2020-08-14 14:58:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							67cc39af7f 
							
						 
					 
					
						
						
							
							Update Thinc and include section order  
						
						
						
					 
					
						2020-08-14 14:06:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							88b0a96801 
							
						 
					 
					
						
						
							
							Update for new Thinc and adjust config  
						
						
						
					 
					
						2020-08-13 17:38:30 +02:00 
						 
				 
			
				
					
						
							
							
								Adam Bittlingmayer 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7b33b2854f 
							
						 
					 
					
						
						
							
							Add Armenian sentence-final verchaket,  Greek question mark and Arabic question mark to default punct ( #5910 )  
						
						
						
						
						* Add Armenian sentence-final verchaket
* Add Greek and Arabic question marks, and contributor agreement
* Check box 
						
					 
					
						2020-08-12 15:36:14 +02:00 
						 
				 
			
				
					
						
							
							
								graue70 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							49e690bde1 
							
						 
					 
					
						
						
							
							Fix typos in comments ( #5904 )  
						
						
						
						
						* Fix typo in comment
* Fix typo
* Add spaCy Contributor Agreement 
						
					 
					
						2020-08-12 15:35:25 +02:00 
						 
				 
			
				
					
						
							
							
								graue70 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ba84371ab0 
							
						 
					 
					
						
						
							
							Use init parameter ( #5909 )  
						
						
						
					 
					
						2020-08-11 23:41:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							950832f087 
							
						 
					 
					
						
						
							
							Tidy up pipes ( #5906 )  
						
						
						
						
						* Tidy up pipes
* Fix init, defaults and raise custom errors
* Update docs
* Update docs [ci skip]
* Apply suggestions from code review
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
* Tidy up error handling and validation, fix consistency
* Simplify get_examples check
* Remove unused import [ci skip]
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-11 23:29:31 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f79e4c094d 
							
						 
					 
					
						
						
							
							Remove generic type  
						
						
						
						
						Seems to cause error on Python 3.8 with Cython? 
						
					 
					
						2020-08-10 17:24:30 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c099f6eece 
							
						 
					 
					
						
						
							
							Add Token.lex  
						
						
						
					 
					
						2020-08-10 16:43:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							933a7cf8d1 
							
						 
					 
					
						
						
							
							Fix Lexeme.from_ptr  
						
						
						
					 
					
						2020-08-10 16:43:37 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							64f2f84098 
							
						 
					 
					
						
						
							
							Update docstrings and docs [ci skip]  
						
						
						
					 
					
						2020-08-10 13:45:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a4b448eec4 
							
						 
					 
					
						
						
							
							Remove unused compiler flag  
						
						
						
					 
					
						2020-08-10 13:13:18 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3eaeb73342 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-08-09 22:36:23 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							d5c78c7a34 
							
						 
					 
					
						
						
							
							Update docs and fix consistency  
						
						
						
					 
					
						2020-08-09 22:31:52 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7c6854d8d4 
							
						 
					 
					
						
						
							
							Fix missing imports  
						
						
						
					 
					
						2020-08-09 22:28:29 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0fc13b2f14 
							
						 
					 
					
						
						
							
							Set version to v3.0.0a6  
						
						
						
					 
					
						2020-08-09 21:53:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a15c5fb191 
							
						 
					 
					
						
						
							
							Update docstrings and docs  
						
						
						
					 
					
						2020-08-09 16:10:48 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8d2baa153d 
							
						 
					 
					
						
						
							
							Update tokenizer docs and add test  
						
						
						
					 
					
						2020-08-09 15:24:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							134d933d67 
							
						 
					 
					
						
						
							
							Add docstring for entity linker factory  
						
						
						
					 
					
						2020-08-09 15:19:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							992ee1c02f 
							
						 
					 
					
						
						
							
							Update tagger docstring  
						
						
						
					 
					
						2020-08-09 15:09:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ebf9a7acbf 
							
						 
					 
					
						
						
							
							Add textcat docstring  
						
						
						
					 
					
						2020-08-09 15:07:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8a13f510d6 
							
						 
					 
					
						
						
							
							Update tests  
						
						
						
					 
					
						2020-08-09 15:01:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							bbd8acd4bf 
							
						 
					 
					
						
						
							
							Add docstrings for parser and NER. Simplify some arguments  
						
						
						
					 
					
						2020-08-09 14:46:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							39a3d64c01 
							
						 
					 
					
						
						
							
							Add docstrings for Tok2Vec component  
						
						
						
					 
					
						2020-08-09 00:48:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fd20f84927 
							
						 
					 
					
						
						
							
							Merge pull request  #5895  from explosion/docs/batchers  
						
						
						
						
						Draft docstrings for batchers 
						
					 
					
						2020-08-07 20:07:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f5c4e0b751 
							
						 
					 
					
						
						
							
							Add docstrings for batchers  
						
						
						
					 
					
						2020-08-07 18:51:02 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fe29ceec9e 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into docs/model-docstrings  
						
						
						
					 
					
						2020-08-07 18:42:01 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3a193eb8f1 
							
						 
					 
					
						
						
							
							Fix imports, types and default configs  
						
						
						
					 
					
						2020-08-07 18:40:54 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b1d83fc13e 
							
						 
					 
					
						
						
							
							Fix imports  
						
						
						
					 
					
						2020-08-07 16:55:54 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							473504d837 
							
						 
					 
					
						
						
							
							Format  
						
						
						
					 
					
						2020-08-07 16:49:00 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							234c52a91e 
							
						 
					 
					
						
						
							
							Add tok2vec docstrings  
						
						
						
					 
					
						2020-08-07 16:48:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							547bc8a82b 
							
						 
					 
					
						
						
							
							Add docstring notes  
						
						
						
					 
					
						2020-08-07 16:17:34 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6f3649923c 
							
						 
					 
					
						
						
							
							Merge pull request  #5893  from explosion/feature/validate-arg  
						
						
						
					 
					
						2020-08-07 15:47:20 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e962784531 
							
						 
					 
					
						
						
							
							Add Lemmatizer and simplify related components ( #5848 )  
						
						
						
						
						* Add Lemmatizer and simplify related components
* Add `Lemmatizer` pipe with `lookup` and `rule` modes using the `Lookups` tables.
* Reduce `Tagger` to a simple tagger that sets `Token.tag` (no pos or lemma)
* Reduce `Morphology` to only keep track of morph tags (no tag map, lemmatizer, or morph rules)
* Remove lemmatizer from `Vocab`
* Adjust many many tests
Differences:
* No default lookup lemmas
* No special treatment of TAG in `from_array` and similar required
* Easier to modify labels in a `Tagger`
* No extra strings added from morphology / tag map
* Fix test
* Initial fix for Lemmatizer config/serialization
* Adjust init test to be more generic
* Adjust init test to force empty Lookups
* Add simple cache to rule-based lemmatizer
* Convert language-specific lemmatizers
Convert language-specific lemmatizers to component lemmatizers. Remove previous lemmatizer class.
* Fix French and Polish lemmatizers
* Remove outdated UPOS conversions
* Update Russian lemmatizer init in tests
* Add minimal init/run tests for custom lemmatizers
* Add option to overwrite existing lemmas
* Update mode setting, lookup loading, and caching
* Make `mode` an immutable property
* Only enforce strict `load_lookups` for known supported modes
* Move caching into individual `_lemmatize` methods
* Implement strict when lang is not found in lookups
* Fix tables/lookups in make_lemmatizer
* Reallow provided lookups and allow for stricter checks
* Add lookups asset to all Lemmatizer pipe tests
* Rename lookups in lemmatizer init test
* Clean up merge
* Refactor lookup table loading
* Add helper from `load_lemmatizer_lookups` that loads required and optional lookups tables based on settings provided by a config.
Additional slight refactor of lookups:
* Add `Lookups.set_table` to set a table from a provided `Table`
* Reorder class definitions to be able to specify type as `Table`
* Move registry assets into test methods
* Refactor lookups tables config
Use class methods within `Lemmatizer` to provide the config for particular modes and to load the lookups from a config.
* Add pipe and score to lemmatizer
* Simplify Tagger.score
* Add missing import
* Clean up imports and auto-format
* Remove unused kwarg
* Tidy up and auto-format
* Update docstrings for Lemmatizer
Update docstrings for Lemmatizer.
Additionally modify `is_base_form` API to take `Token` instead of individual features.
* Update docstrings
* Remove tag map values from Tagger.add_label
* Update API docs
* Fix relative link in Lemmatizer API docs 
						
					 
					
						2020-08-07 15:27:13 +02:00 
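Several points from this commit (separate `lookup` and `rule` modes, a read-only `mode` property, a simple cache for the rule-based path) can be illustrated with a toy class. This is a rough sketch under stated assumptions, not spaCy's `Lemmatizer`: the class name `TinyLemmatizer`, the suffix-rule format and the plain-dict tables are hypothetical stand-ins for the `Lookups` tables.

```python
from typing import Dict, List, Tuple


class TinyLemmatizer:
    """Toy stand-in for a lemmatizer with "lookup" and "rule" modes."""

    def __init__(
        self,
        mode: str,
        lookup: Dict[str, str],
        suffix_rules: List[Tuple[str, str]],
    ) -> None:
        if mode not in ("lookup", "rule"):
            raise ValueError(f"Unsupported mode: {mode}")
        self._mode = mode
        self.lookup = lookup
        self.suffix_rules = suffix_rules
        self.cache: Dict[str, str] = {}

    @property
    def mode(self) -> str:
        # Read-only: no setter is defined, so assignment raises AttributeError.
        return self._mode

    def lemmatize(self, word: str) -> str:
        if self._mode == "lookup":
            # Lookup mode: table hit, or fall back to the word itself.
            return self.lookup.get(word, word)
        return self._rule_lemmatize(word)

    def _rule_lemmatize(self, word: str) -> str:
        # The cache lives inside the mode-specific method.
        if word in self.cache:
            return self.cache[word]
        lemma = word
        for old, new in self.suffix_rules:
            if word.endswith(old):
                lemma = word[: len(word) - len(old)] + new
                break  # first matching rule wins
        self.cache[word] = lemma
        return lemma
```

Keeping the mode fixed after construction means the tables loaded for one mode can never be used by mistake with the logic of another.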
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							da6e59519e 
							
						 
					 
					
						
						
							
							Add docstrings for simple_ner  
						
						
						
					 
					
						2020-08-07 15:09:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7ef8a64df9 
							
						 
					 
					
						
						
							
							Add docstring for parser  
						
						
						
					 
					
						2020-08-07 14:59:34 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fc9a4fe827 
							
						 
					 
					
						
						
							
							Update attribute ruler  
						
						
						
					 
					
						2020-08-07 14:43:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							a8404c3517 
							
						 
					 
					
						
						
							
							validation -> validate  
						
						
						
					 
					
						2020-08-07 14:43:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1d01d89b79 
							
						 
					 
					
						
						
							
							Update CLI docs and evaluate command [ci skip]  
						
						
						
					 
					
						2020-08-07 14:40:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ef2c67cca5 
							
						 
					 
					
						
						
							
							Add DocBin to/from_disk methods and update docs ( #5892 )  
						
						
						
						
						* Add DocBin to/from_disk methods and update docs
* Use DocBin.from_disk in Corpus 
						
					 
					
						2020-08-07 14:30:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4ca08c6d5d 
							
						 
					 
					
						
						
							
							Merge pull request  #5891  from adrianeboyd/docs/attribute-ruler-api  
						
						
						
						
						Add AttributeRuler API docs 
						
					 
					
						2020-08-07 13:55:12 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							b8d0c23857 
							
						 
					 
					
						
						
							
							Add AttributeRuler API docs  
						
						
						
						
						With additional minor updates to AttributeRuler docstrings. 
						
					 
					
						2020-08-07 12:43:23 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							b17db0e994 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into feature/el-docs  
						
						
						
						
						# Conflicts:
#	website/docs/usage/training.md 
						
					 
					
						2020-08-06 19:48:52 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							06c3a5e048 
							
						 
					 
					
						
						
							
							Add pipe to AttributeRuler ( #5889 )  
						
						
						
					 
					
						2020-08-06 19:43:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9b7f198390 
							
						 
					 
					
						
						
							
							Fix format  
						
						
						
					 
					
						2020-08-06 19:30:53 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3c4389110d 
							
						 
					 
					
						
						
							
							Remove unused imports  
						
						
						
					 
					
						2020-08-06 19:30:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d4525816ef 
							
						 
					 
					
						
						
							
							Be less choosy about reporting textcat scores ( #5879 )  
						
						
						
						
						* Set textcat scores more consistently
* Refactor textcat scores
* Fixes to scorer
* Add comments
* Add threshold
* Rename just 'f' to micro_f in textcat scorer
* Fix textcat score for two-class
* Fix syntax
* Fix textcat score
* Fix docstring 
						
					 
					
						2020-08-06 16:24:13 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							0b4d1e1bc4 
							
						 
					 
					
						
						
							
							'debug data' instead of 'debug-data'  
						
						
						
					 
					
						2020-08-06 15:47:31 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							881e3f8fd0 
							
						 
					 
					
						
						
							
							add docbin explanation and example  
						
						
						
					 
					
						2020-08-06 15:29:44 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5e683a6e46 
							
						 
					 
					
						
						
							
							Fix return values for per feat score ( #5885 )  
						
						
						
						
						* Fix return values for per feat score
Convert `PRFScore` to dict as other per type scores.
* Update tests accordingly 
						
					 
					
						2020-08-06 15:14:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							913d21f0a3 
							
						 
					 
					
						
						
							
							Merge pull request  #5882  from explosion/feature/raise-from  
						
						
						
						
						Use "raise ... from" in custom errors for better tracebacks 
						
					 
					
						2020-08-06 00:35:26 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							06e80d95cd 
							
						 
					 
					
						
						
							
							Sync develop with nightly docs state ( #5883 )  
						
						
						
						
						Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com> 
						
					 
					
						2020-08-06 00:28:14 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d92954ac1d 
							
						 
					 
					
						
						
							
							Merge pull request  #5881  from explosion/feature/better-error-model-shortcuts  
						
						
						
					 
					
						2020-08-06 00:13:35 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							56c17973aa 
							
						 
					 
					
						
						
							
							Use "raise ... from" in custom errors for better tracebacks  
						
						
						
					 
					
						2020-08-05 23:53:21 +02:00 
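The "raise ... from" change above refers to Python's explicit exception chaining. The following is a minimal sketch of the idea, not the project's actual code: `ConfigLookupError` and `get_setting` are hypothetical names chosen for illustration.

```python
class ConfigLookupError(ValueError):
    """Hypothetical custom error, standing in for a library-specific one."""


def get_setting(config: dict, key: str):
    try:
        return config[key]
    except KeyError as err:
        # "from err" stores the original KeyError on __cause__, so the
        # traceback reports it as the direct cause of the custom error
        # rather than as an unrelated error raised during handling.
        raise ConfigLookupError(f"Missing config setting: {key!r}") from err


try:
    get_setting({"lang": "en"}, "pipeline")
except ConfigLookupError as exc:
    caught = exc  # keep a reference so the chained cause can be inspected
```

Inspecting `caught.__cause__` gives access to the original `KeyError`, which is what makes the resulting traceback easier to follow.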
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5cc0d89fad 
							
						 
					 
					
						
						
							
							Simplify config overrides in CLI and deserialization ( #5880 )  
						
						
						
					 
					
						2020-08-05 23:35:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0881455a5d 
							
						 
					 
					
						
						
							
							Update error message  
						
						
						
					 
					
						2020-08-05 23:15:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2a1fa86a0d 
							
						 
					 
					
						
						
							
							Add better error for failed model shortcut loading  
						
						
						
					 
					
						2020-08-05 23:10:29 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c675746ca2 
							
						 
					 
					
						
						
							
							Update docstrings and types  
						
						
						
					 
					
						2020-08-05 20:29:46 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							823e533dc1 
							
						 
					 
					
						
						
							
							Add config callbacks for modifying nlp object before and after init ( #5866 )  
						
						
						
						
						* WIP: Concept for modifying nlp object before and after init
* Make callbacks return nlp object
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
* Raise if callbacks don't return correct type
* Rename, update types, add after_pipeline_creation
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-05 19:47:54 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							586d695775 
							
						 
					 
					
						
						
							
							Merge branch 'develop' of  https://github.com/explosion/spaCy  into develop  
						
						
						
					 
					
						2020-08-05 16:01:11 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e68459296d 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-08-05 16:00:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							50c0e49741 
							
						 
					 
					
						
						
							
							Fix train CLI  
						
						
						
					 
					
						2020-08-05 15:40:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b9df4d6116 
							
						 
					 
					
						
						
							
							Fix textcat.begin_training if vectors set  
						
						
						
					 
					
						2020-08-05 15:40:36 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4193402c47 
							
						 
					 
					
						
						
							
							Add warning when Matcher subpattern is discarded ( #5873 )  
						
						
						
						
						* Add a warning when a subpattern is not processed and discarded
* Normalize subpattern attribute/operator keys to upper case like
top-level attributes 
						
					 
					
						2020-08-05 14:56:14 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							af125875cf 
							
						 
					 
					
						
						
							
							Update SimpleNER ( #5878 )  
						
						
						
						
						* Fix `get_loss` to use NER annotation
* Add labels as part of cfg
* Add simple overfitting test 
						
					 
					
						2020-08-05 14:43:29 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b88c5c701a 
							
						 
					 
					
						
						
							
							Bugfix in nlp.replace_pipe ( #5875 )  
						
						
						
						
						* bugfix and unit test
* merge two conditions 
						
					 
					
						2020-08-05 09:30:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b795f02fbd 
							
						 
					 
					
						
						
							
							Allow adding pipeline components from source model ( #5857 )  
						
						
						
						
						* Allow adding pipeline components from source model
* Config: name -> component
* Improve error messages
* Fix error and test
* Add frozen components and exclude logic
* Remove exclude from Language.evaluate
* Init sourced components with current vocab
* Fix error codes 
						
					 
					
						2020-08-04 23:39:19 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							34873c4911 
							
						 
					 
					
						
						
							
							Example Dict format consistency ( #5858 )  
						
						
						
						
						* consistently use upper-case IDS in token_annotation format and for get_aligned
* remove ID from to_dict (not used in from_dict either)
* fix test
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-04 22:22:26 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fa79a0db9f 
							
						 
					 
					
						
						
							
							Add AttributeRuler for token attribute exceptions ( #5842 )  
						
						
						
						
						* Add AttributeRuler for token attribute exceptions
Add the `AttributeRuler` to handle exceptions for token-level
attributes. The `AttributeRuler` uses `Matcher` patterns to identify
target spans and applies the specified attributes to the token at the
provided index in the matched span. A negative index can be used to
index from the end of the matched span. The retokenizer is used to
"merge" the individual tokens and assign them the provided attributes.
Helper functions can import existing tag maps and morph rules to the
corresponding `Matcher` patterns.
There is an additional minor bug fix for `MORPH` attributes in the
retokenizer to correctly normalize the values and to handle `MORPH`
alongside `_` in an attrs dict.
* Fix default name
* Update name in error message
* Extend AttributeRuler functionality
* Add option to initialize with a dict of AttributeRuler patterns
* Instead of silently discarding overlapping matches (the default
behavior for the retokenizer if only the attrs differ), split the
matches into disjoint sets and retokenize each set separately. This
allows, for instance, one pattern to set the POS and another pattern to
set the lemma. (If two matches modify the same attribute, it looks like
the attrs are applied in the order they were added, but it may not be
deterministic?)
* Improve types
* Sort spans before processing
* Fix index boundaries in Span
* Refactor retokenizer to separate attrs methods
Add top-level `normalize_token_attrs` and `set_token_attrs` methods.
* Update AttributeRuler to use refactored methods
Update `AttributeRuler` to replace use of full retokenizer with only the
relevant methods for normalizing and setting attributes for a single
token.
* Update spacy/pipeline/attributeruler.py
Co-authored-by: Ines Montani <ines@ines.io>
* Make API more similar to EntityRuler
* Add `AttributeRuler.add_patterns` to add patterns from a list of dicts
* Return list of dicts as property `AttributeRuler.patterns`
* Make attrs_unnormed private
* Add test loading patterns from assets
* Revert "Fix index boundaries in Span"
This reverts commit 8f8a5c3386.
* Add Span index boundary checks (#5861)
* Return Span-specific IndexError in all cases
* Simplify and fix if/else
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2020-08-04 17:02:39 +02:00 
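The AttributeRuler commit above describes applying attributes to the token at a given index within a matched span, where a negative index counts from the end of the match. The sketch below illustrates only that index resolution in plain Python. It is not the library's implementation: `apply_attrs` and the dict-based tokens are hypothetical stand-ins.

```python
from typing import Dict, List


def apply_attrs(tokens: List[Dict[str, str]], match_start: int, match_end: int,
                index: int, attrs: Dict[str, str]) -> None:
    """Set attrs on one token of the matched span tokens[match_start:match_end]."""
    length = match_end - match_start
    if not -length <= index < length:
        raise IndexError(f"index {index} is outside a match of {length} tokens")
    # A negative index counts from the end of the match, like Python sequences.
    offset = index if index >= 0 else length + index
    tokens[match_start + offset].update(attrs)


tokens = [{"text": t} for t in ["She", "gave", "up", "."]]
# Suppose a hypothetical pattern matched "gave up" (tokens 1..3, end exclusive).
apply_attrs(tokens, 1, 3, 0, {"LEMMA": "give"})   # first token of the match
apply_attrs(tokens, 1, 3, -1, {"POS": "ADP"})     # last token of the match
```

Because each call touches a single token and a single set of attributes, two patterns can set different attributes on overlapping matches without one discarding the other, which is the behaviour the commit message aims for.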
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							492d1ec5de 
							
						 
					 
					
						
						
							
							Prevent alignment when texts don't match ( #5867 )  
						
						
						
						
						* remove empty gold.pyx
* add alignment unit test (to be used in docs)
* ensure that Alignment is only used on equal texts
* additional test using example.alignment
* formatting
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
						
					 
					
						2020-08-04 16:29:18 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ecb3c4e8f4 
							
						 
					 
					
						
						
							
							Create corpus iterator and batcher from registry during training ( #5865 )  
						
						
						
						
						* Move batchers into their own module (and registry)
* Update CLI
* Update Corpus and batcher
* Update tests
* Update one config
* Merge 'evaluation' block back under [training]
* Import batchers in gold __init__
* Fix batchers
* Update config
* Update schema
* Update util
* Don't assume train and dev are actually paths
* Update onto-joint config
* Fix missing import
* Format
* Format
* Update spacy/gold/corpus.py
Co-authored-by: Ines Montani <ines@ines.io>
* Fix name
* Update default config
* Fix get_length option in batchers
* Update test
* Add comment
* Pass path into Corpus
* Update docstring
* Update schema and configs
* Update config
* Fix test
* Fix paths
* Fix print
* Fix create_train_batches
* [training.read_train] -> [training.train_corpus]
* Update onto-joint config
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2020-08-04 15:09:37 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							82347110f5 
							
						 
					 
					
						
						
							
							Default empty KB in EL component ( #5872 )  
						
						
						
						
						* EL field documentation
* documentation consistent with docs
* default empty KB, initialize vocab separately
* formatting
* add test for changing the default entity vector length
* update comment 
						
					 
					
						2020-08-04 14:34:09 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b7e3018d97 
							
						 
					 
					
						
						
							
							Recalculate alignment if tokenization differs ( #5868 )  
						
						
						
						
						* Recalculate alignment if tokenization differs
* Refactor cached alignment data 
						
					 
					
						2020-08-04 14:31:32 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c62fd878a3 
							
						 
					 
					
						
						
							
							Allow Doc.char_span to snap to token boundaries ( #5849 )  
						
						
						
						
						* Allow Doc.char_span to snap to token boundaries
Add a `mode` option to allow `Doc.char_span` to snap to token
boundaries. The `mode` options:
* `strict`: character offsets must match token boundaries (default, same as
before)
* `inside`: all tokens completely within the character span
* `outside`: all tokens at least partially covered by the character span
Add a new helper function `token_by_char` that returns the token
corresponding to a character position in the text. Update
`token_by_start` and `token_by_end` to use `token_by_char` for more
efficient searching.
* Remove unused import
* Rename mode to alignment_mode
Rename `mode` to `alignment_mode` with the options
`strict`/`contract`/`expand`. Any unrecognized modes are silently
converted to `strict`. 
						
					 
					
						2020-08-04 13:36:32 +02:00 
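The three `alignment_mode` options in the commit above (`strict`, `contract`, `expand`) can be illustrated with a small plain-Python sketch. It models tokens as character-offset pairs and is an assumption-laden illustration, not the library's code: `snap_char_span` is a hypothetical helper, and it returns token indices rather than a `Span`.

```python
from typing import List, Optional, Tuple

# Each token is a (start_char, end_char) pair; end_char is exclusive.
Token = Tuple[int, int]


def snap_char_span(tokens: List[Token], start: int, end: int,
                   alignment_mode: str = "strict") -> Optional[Tuple[int, int]]:
    """Return (first_token, last_token_exclusive) indices, or None."""
    if alignment_mode == "expand":
        # Keep every token that overlaps [start, end) at least partially.
        covered = [i for i, (s, e) in enumerate(tokens) if s < end and e > start]
    else:
        # Keep only tokens lying completely inside [start, end).
        covered = [i for i, (s, e) in enumerate(tokens) if s >= start and e <= end]
    if not covered:
        return None
    if alignment_mode not in ("contract", "expand"):
        # "strict", and any unrecognized mode: the character offsets must
        # coincide exactly with token boundaries.
        if tokens[covered[0]][0] != start or tokens[covered[-1]][1] != end:
            return None
    return covered[0], covered[-1] + 1


# "I like New York": I(0-1) like(2-6) New(7-10) York(11-15)
tokens = [(0, 1), (2, 6), (7, 10), (11, 15)]
strict_hit = snap_char_span(tokens, 7, 15)               # exact boundaries
strict_miss = snap_char_span(tokens, 8, 15)              # starts inside "New"
contracted = snap_char_span(tokens, 8, 15, "contract")   # only "York" fits
expanded = snap_char_span(tokens, 8, 15, "expand")       # "New York"
```

With offsets that begin inside "New", strict mode finds no span, contract mode keeps only "York", and expand mode widens the result to "New York".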
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b841248589 
							
						 
					 
					
						
						
							
							Add Span index boundary checks ( #5861 )  
						
						
						
						
						* Add Span index boundary checks
* Return Span-specific IndexError in all cases
* Simplify and fix if/else 
						
					 
					
						2020-08-04 13:35:25 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cd59979ab4 
							
						 
					 
					
						
						
							
							Fix span boundary handling in Spanish noun_chunks ( #5860 )  
						
						
						
					 
					
						2020-08-03 13:53:15 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							934447a611 
							
						 
					 
					
						
						
							
							Merge pull request  #5855  from svlandeg/fix/cli-debug  
						
						
						
					 
					
						2020-08-03 13:09:20 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4c055f0aa7 
							
						 
					 
					
						
						
							
							Add init CLI and init config ( #5854 )  
						
						
						
						
						* Add init CLI and init config draft
* Improve config validation
* Auto-format
* Don't export anything in debug config
* Update docs 
						
					 
					
						2020-08-02 15:18:30 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6f4e46ee93 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/develop' into fix/cli-debug  
						
						
						
						
						# Conflicts:
#	pyproject.toml
#	requirements.txt
#	setup.cfg 
						
					 
					
						2020-08-01 18:38:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b40f44419b 
							
						 
					 
					
						
						
							
							Simplify pipe analysis  
						
						
						
						
						- remove unused code
- don't print by default
- integrate attrs info into analysis output 
						
					 
					
						2020-08-01 13:40:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b68c53858c 
							
						 
					 
					
						
						
							
							Remove global  
						
						
						
					 
					
						2020-07-31 18:37:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							30a76fcf6f 
							
						 
					 
					
						
						
							
							Integrate and simplify pipe analysis  
						
						
						
					 
					
						2020-07-31 18:34:35 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							9b719dfb1a 
							
						 
					 
					
						
						
							
							use divider inbetween steps  
						
						
						
					 
					
						2020-07-31 18:06:48 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							51ffc4a166 
							
						 
					 
					
						
						
							
							rename pipe_name to component  
						
						
						
					 
					
						2020-07-31 17:58:55 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							878327d38e 
							
						 
					 
					
						
						
							
							printing final predictions by default to False  
						
						
						
					 
					
						2020-07-31 17:36:32 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							2d955fbf98 
							
						 
					 
					
						
						
							
							Fix linting [ci skip]  
						
						
						
					 
					
						2020-07-31 17:05:28 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e9e8fa2466 
							
						 
					 
					
						
						
							
							Update docs and types  
						
						
						
					 
					
						2020-07-31 17:02:54 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							cc2f58a1b0 
							
						 
					 
					
						
						
							
							use data_validation context manager  
						
						
						
					 
					
						2020-07-31 16:49:42 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ac14ce7c30 
							
						 
					 
					
						
						
							
							Prefer earlier spans in EntityRuler ( #5843 )  
						
						
						
						
						Similar to #4414 , update the sorting in EntityRuler to prefer the first
span in overlapping spans. 
						
					 
					
						2020-07-31 16:09:32 +02:00 
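The EntityRuler commit above changes the sort order so that the first span is preferred among overlapping spans. A minimal sketch of such a tie-break follows. The commit message states only the "prefer the first span" rule, so ranking longer spans ahead of shorter ones is an assumption here, and `filter_overlaps` is a hypothetical helper rather than the library's function.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start_token, end_token_exclusive, label)


def filter_overlaps(spans: List[Span]) -> List[Span]:
    # Longer spans first (assumption); among equal lengths, the earlier
    # start wins, which is the tie-break the commit message describes.
    ordered = sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))
    seen = set()
    kept = []
    for start, end, label in ordered:
        # Skip any span that shares a token with one already kept.
        if any(i in seen for i in range(start, end)):
            continue
        kept.append((start, end, label))
        seen.update(range(start, end))
    return sorted(kept, key=lambda s: s[0])


# "A" and "B" have equal length and overlap on token 1; "A" starts earlier.
candidates = [(1, 3, "B"), (0, 2, "A"), (4, 5, "C")]
result = filter_overlaps(candidates)
```

Here the result keeps "A" and "C" and drops "B", regardless of the order in which the candidates were supplied.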
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							5fa3235d06 
							
						 
					 
					
						
						
							
							set DATA_VALIDATION to False for debug_model (upgrade thinc)  
						
						
						
					 
					
						2020-07-31 15:21:01 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							08d3c36c20 
							
						 
					 
					
						
						
							
							bugfix in train CLI  
						
						
						
					 
					
						2020-07-31 15:03:43 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							9b509aa87f 
							
						 
					 
					
						
						
							
							Move Language.evaluate scorer config to new arg  
						
						
						
						
						Move `Language.evaluate` scorer config from `component_cfg` to separate
argument `scorer_cfg`. 
						
					 
					
						2020-07-31 11:05:16 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							901801b33b 
							
						 
					 
					
						
						
							
							Fix default arguments in DependencyParser.score  
						
						
						
					 
					
						2020-07-31 10:55:44 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							9d79916792 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/scorer-adjustments  
						
						
						
					 
					
						2020-07-31 10:48:14 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ca491722ad 
							
						 
					 
					
						
						
							
							The Parser is now a Pipe (2) ( #5844 )  
						
						
						
						
						* moving syntax folder to _parser_internals
* moving nn_parser and transition_system
* move nn_parser and transition_system out of internals folder
* moving nn_parser code into transition_system file
* rename transition_system to transition_parser
* moving parser_model and _state to ml
* move _state back to internals
* The Parser now inherits from Pipe!
* small code fixes
* removing unnecessary imports
* remove link_vectors_to_models
* transition_system to internals folder
* little bit more cleanup
* newlines 
						
					 
					
						2020-07-30 23:30:54 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							0b23594953 
							
						 
					 
					
						
						
							
							pipe_name instead of section in debug_model  
						
						
						
					 
					
						2020-07-30 20:06:28 +02:00 
						 
				 
			
				
					
						
							
							
								Rahul Gupta 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f76fae0e8d 
							
						 
					 
					
						
						
							
							English: adds ordinal numbers ( #5830 )  
						
						
						
					 
					
						2020-07-29 20:22:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7a21775cd0 
							
						 
					 
					
						
						
							
							Merge pull request  #5834  from explosion/feature/vectors  
						
						
						
					 
					
						2020-07-29 18:49:26 +02:00 
						 
				 
			
				
					
						
							
							
								Gustavo Zadrozny Leyendecker 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							90b958fd01 
							
						 
					 
					
						
						
							
							Fix EntityRenderer to support line breaks (after last entity) (closes #5838)  
						
						
						
					 
					
						2020-07-29 18:48:39 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b0f57a0cac 
							
						 
					 
					
						
						
							
							Update docs and consistency  
						
						
						
					 
					
						2020-07-29 15:14:07 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a2d573c039 
							
						 
					 
					
						
						
							
							Merge branch 'feature/vectors' of  https://github.com/explosion/spaCy  into feature/vectors  
						
						
						
					 
					
						2020-07-29 14:56:27 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2af741d7e3 
							
						 
					 
					
						
						
							
							Fix train arg  
						
						
						
					 
					
						2020-07-29 14:56:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c27309f839 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/vectors  
						
						
						
					 
					
						2020-07-29 14:54:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							62266fb828 
							
						 
					 
					
						
						
							
							Fix broken type annotation  
						
						
						
					 
					
						2020-07-29 14:49:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							142b58be92 
							
						 
					 
					
						
						
							
							Fix import  
						
						
						
					 
					
						2020-07-29 14:45:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c99a653070 
							
						 
					 
					
						
						
							
							Adjust textcat model  
						
						
						
					 
					
						2020-07-29 14:38:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9e1b11dd81 
							
						 
					 
					
						
						
							
							Update vectors in textcat  
						
						
						
					 
					
						2020-07-29 14:35:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							105cf29967 
							
						 
					 
					
						
						
							
							Fix DocBin  
						
						
						
					 
					
						2020-07-29 14:23:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ff0bc05da8 
							
						 
					 
					
						
						
							
							Fix docstrings [ci skip]  
						
						
						
					 
					
						2020-07-29 14:09:37 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							6e2623d3f8 
							
						 
					 
					
						
						
							
							Fix docstring [ci skip]  
						
						
						
					 
					
						2020-07-29 14:08:05 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8d56260d92 
							
						 
					 
					
						
						
							
							Fix docstrings [ci skip]  
						
						
						
					 
					
						2020-07-29 14:07:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							80b18124d2 
							
						 
					 
					
						
						
							
							Fix docstring [ci skip]  
						
						
						
					 
					
						2020-07-29 14:03:35 +02:00