Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5bbdd7dc4c 
							
						 
					 
					
						
						
							
							Update pipeline design docs [ci skip]  
						
						
						
					 
					
						2021-04-06 14:13:22 +10:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							1d1cfadbca 
							
						 
					 
					
						
						
							
							Fix formatting [ci skip]  
						
						
						
					 
					
						2021-04-06 14:13:13 +10:00 
						 
				 
			
				
					
						
							
							
								Jaidev Deshpande 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							93ee74a0a6 
							
						 
					 
					
						
						
							
							Add Numerizer to SpaCy universe ( #7650 )  
						
						
						
						
						Numerizer is a spaCy extension that converts numbers written in natural language
into numeric strings. 
						
					 
					
						2021-04-05 19:02:27 +02:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7944761ba7 
							
						 
					 
					
						
						
							
							Add warning if initial vectors are empty ( #7641 )  
						
						
						
						
						See #7637 , where this came up. 
						
					 
					
						2021-04-04 20:20:24 +02:00 
						 
				 
			
				
					
						
							
							
								Sam Edwardes 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f6ad4684bd 
							
						 
					 
					
						
						
							
							Updates to universe.json for spaCyTextBlob ( #7647 )  
						
						
						
						
						* Updates to universe.json for spaCyTextBlob
Updated the documentation for spaCy 3.0.
* SamEdwardes.md
* Update SamEdwardes.md 
						
					 
					
						2021-04-04 20:17:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ayush Chaurasia 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3c2ce41dd8 
							
						 
					 
					
						
						
							
							W&B integration: Optional support for dataset and model checkpoint logging and versioning  ( #7429 )  
						
						
						
						
						* Add optional artifacts logging
* Update docs
* Update spacy/training/loggers.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/training/loggers.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update spacy/training/loggers.py
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Bump WandbLogger Version
* Add documentation of v1 to legacy docs
* bump spacy-legacy to 3.0.2 (to be released)
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com> 
						
					 
					
						2021-04-01 19:36:23 +02:00 
						 
				 
			
				
					
						
							
							
								vincent d warmerdam 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8b3eec6e62 
							
						 
					 
					
						
						
							
							Add Tokenwiser to Projects ( #7541 )  
						
						
						
						
						* Add tokenwiser
* Update universe.json 
						
					 
					
						2021-04-01 14:39:36 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							59c2069eb1 
							
						 
					 
					
						
						
							
							Legacy docs ( #7601 )  
						
						
						
						
						* document legacy Tok2Vec architectures
* add TextCatEnsemble.v1 legacy documentation
* Separate legacy section in side bar 
						
					 
					
						2021-03-30 12:43:14 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							348d1829c7 
							
						 
					 
					
						
						
							
							Preserve user data for DependencyMatcher on spans ( #7528 )  
						
						
						
						
						* Preserve user data for DependencyMatcher on spans
* Clean underscore in test
* Modify test to use extensions stored in user data 
						
					 
					
						2021-03-30 12:26:22 +02:00 
						 
				 
			
				
					
						
							
							
								m0canu1 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							921feee092 
							
						 
					 
					
						
						
							
							Added more exceptions to the Italian language from  https://forum.wordr … ( #7246 )  
						
						
						
						
						* Added more exceptions to the Italian language from https://forum.wordreference.com/threads/le-abbreviazioni-nella-lingua-italiana-abbreviations-in-italian.2464189/ 
* Remove unnecessary exception
Co-authored-by: Alexandru Mocanu <alexandru.mocanu@augeos.it>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2021-03-30 10:23:32 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							27a48f2802 
							
						 
					 
					
						
						
							
							Fix/update extension copying in Span.as_doc and Doc.from_docs ( #7574 )  
						
						
						
						
						* Adjust custom extension data when copying user data in `Span.as_doc()`
* Restrict `Doc.from_docs()` to adjusting offsets for custom extension
data
  * Update test to use extension
  * (Duplicate bug fix for character offset from #7497 ) 
						
					 
					
						2021-03-30 09:49:12 +02:00 
						 
				 
			
				
					
						
							
							
								Santiago Castro 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							af07fc3bc1 
							
						 
					 
					
						
						
							
							Add support for CUDA 11.2 ( #7583 )  
						
						
						
						
						* Add support for CUDA 11.2
* Update the docs
* Format
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2021-03-30 09:47:33 +02:00 
						 
				 
			
				
					
						
							
							
								Álvaro Abella Bascarán 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5b4dde38a3 
							
						 
					 
					
						
						
							
							fix fn name: tokenizer.infixes_finditer -> tokenizer.infix_finditer ( #7606 )  
						
						
						
					 
					
						2021-03-30 09:45:49 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3ae8661085 
							
						 
					 
					
						
						
							
							Fix tensor retokenization for non-numpy ops ( #7527 )  
						
						
						
						
						Implement manual `append` and `delete` for non-numpy ops. 
						
					 
					
						2021-03-29 22:34:48 +11:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							139f655f34 
							
						 
					 
					
						
						
							
							Merge doc.spans in Doc.from_docs() ( #7497 )  
						
						
						
						
						Merge data from `doc.spans` in `Doc.from_docs()`.
* Fix internal character offset set when merging empty docs (only
affects tokens and spans in `user_data` if an empty doc is in the list
of docs) 
						
					 
					
						2021-03-29 22:34:01 +11:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d59f968d08 
							
						 
					 
					
						
						
							
							Keep sent starts without parse in retokenization ( #7424 )  
						
						
						
						
						In the retokenizer, only reset sent starts (with
`set_children_from_head`) if the doc is parsed. If there is no parse,
merged tokens have the unset `token.is_sent_start == None` by default after
retokenization. 
						
					 
					
						2021-03-29 22:32:00 +11:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							faed54d659 
							
						 
					 
					
						
						
							
							Merge pull request  #7537  from polm/docs/patience-negative  
						
						
						
						
						Remove mention of -1 for early stopping (fix  #7535 ) 
						
					 
					
						2021-03-26 21:11:53 +09:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							cdab341a75 
							
						 
					 
					
						
						
							
							Remove mention of -1 for early stopping ( fix   #7535 )  
						
						
						
						
						Maybe this used to work differently, but currently a negative patience
just causes immediate termination. 
						
					 
					
						2021-03-23 11:50:35 +09:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4bd3d01aaf 
							
						 
					 
					
						
						
							
							Merge pull request  #7471  from polm/fix/listener-warnings  
						
						
						
					 
					
						2021-03-22 12:45:02 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d545ab4ca4 
							
						 
					 
					
						
						
							
							Merge pull request  #7495  from adrianeboyd/bugfix/norm-ux  
						
						
						
						
						Update lexeme_norm checks 
						
					 
					
						2021-03-22 12:44:52 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							be55f43163 
							
						 
					 
					
						
						
							
							Merge pull request  #7473  from adrianeboyd/docs/v3-pipeline-deps-order  
						
						
						
					 
					
						2021-03-22 12:43:07 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3ee2fcfba0 
							
						 
					 
					
						
						
							
							Merge pull request  #7483  from adrianeboyd/docs/various-v3-4 [ci skip]  
						
						
						
					 
					
						2021-03-22 12:37:06 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							88e5a0dc16 
							
						 
					 
					
						
						
							
							Merge pull request  #7504  from polm/fix/lexeme-docs [ci skip]  
						
						
						
						
						Fix mismatched backtick in Lexeme docs 
						
					 
					
						2021-03-22 12:36:44 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							66ebd5c69e 
							
						 
					 
					
						
						
							
							Merge pull request  #7491  from adrianeboyd/bugfix/corpus-depr-props  
						
						
						
						
						Update deprecated doc.is_sentenced in Corpus 
						
					 
					
						2021-03-21 02:17:24 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e3c3dbdb15 
							
						 
					 
					
						
						
							
							Merge pull request  #7492  from adrianeboyd/bugfix/ux-matcher-attributes  
						
						
						
						
						Update matcher errors and docs 
						
					 
					
						2021-03-21 02:17:13 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							0d2b723e8d 
							
						 
					 
					
						
						
							
							Update entity setting section  
						
						
						
					 
					
						2021-03-20 11:38:55 +01:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							e39c0dcf33 
							
						 
					 
					
						
						
							
							Fix mismatched backtick in Lexeme docs  
						
						
						
					 
					
						2021-03-20 18:40:00 +09:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							39153ef90f 
							
						 
					 
					
						
						
							
							Update lexeme_norm checks  
						
						
						
						
						* Add util method for check
* Add new languages to list with lexeme norm tables
* Add check to all relevant components
* Add config details to warning message
Note that we're not actually inspecting the model config to see if
`NORM` is used as an attribute, so it may warn in cases where it's not
relevant. 
						
					 
					
						2021-03-19 10:59:27 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							c771ec22f0 
							
						 
					 
					
						
						
							
							Update matcher errors and docs  
						
						
						
						
						* Mention `tagger+attribute_ruler` in `POS`/`MORPH` error messages for
`Matcher` and `PhraseMatcher`
* Document `Matcher.__call__(allow_missing=)` 
						
					 
					
						2021-03-19 10:11:18 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							48b90c8e1c 
							
						 
					 
					
						
						
							
							Update deprecated doc.is_sentenced in Corpus  
						
						
						
					 
					
						2021-03-19 09:43:52 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6a9a467766 
							
						 
					 
					
						
						
							
							Update website/docs/usage/processing-pipelines.md  
						
						
						
						
						Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2021-03-19 08:12:49 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							34e13c1161 
							
						 
					 
					
						
						
							
							Merge pull request  #7472  from erre-quadro/universe/spikex  
						
						
						
						
						Add SpikeX to spaCy universe 
						
					 
					
						2021-03-19 02:08:36 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							4f9aaa2366 
							
						 
					 
					
						
						
							
							Merge pull request  #7451  from adrianeboyd/chore/add-py.typed  
						
						
						
						
						Add py.typed 
						
					 
					
						2021-03-19 02:08:16 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							66b900a76d 
							
						 
					 
					
						
						
							
							Merge pull request  #7440  from adrianeboyd/bugfix/ru-pymorph2-lookup-lemmatize  
						
						
						
						
						Rename and update Russian pymorphy2 lookup lemmatize 
						
					 
					
						2021-03-19 01:54:08 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2c6fa8c890 
							
						 
					 
					
						
						
							
							Merge pull request  #7489  from adrianeboyd/bugfix/callbacks-entry-points  
						
						
						
						
						Check for callbacks entry points 
						
					 
					
						2021-03-19 01:53:53 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							b878bc74b9 
							
						 
					 
					
						
						
							
							Merge pull request  #7488  from Findus23/no-is-not  
						
						
						
						
						replace "is not" with != 
						
					 
					
						2021-03-19 01:53:38 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							0ad9e16ec3 
							
						 
					 
					
						
						
							
							Check for callbacks entry points  
						
						
						
					 
					
						2021-03-18 21:18:25 +01:00 
						 
				 
			
				
					
						
							
							
								Lukas Winkler 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3c362ac520 
							
						 
					 
					
						
						
							
							replace "is not" with !=  
						
						
						
					 
					
						2021-03-18 21:09:11 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6354b642c5 
							
						 
					 
					
						
						
							
							Fix typo  
						
						
						
					 
					
						2021-03-18 19:01:10 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							40e5d3a980 
							
						 
					 
					
						
						
							
							Update saving/loading example  
						
						
						
					 
					
						2021-03-18 16:56:10 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							0fb1881f36 
							
						 
					 
					
						
						
							
							Reformat processing pipelines  
						
						
						
					 
					
						2021-03-18 13:31:42 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							acc58719da 
							
						 
					 
					
						
						
							
							Update custom similarity hooks example  
						
						
						
					 
					
						2021-03-18 13:31:42 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							c9e1a9ac17 
							
						 
					 
					
						
						
							
							Add multiprocessing section  
						
						
						
					 
					
						2021-03-18 13:31:42 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							9a254d3995 
							
						 
					 
					
						
						
							
							Include all en_core_web_sm components in examples  
						
						
						
					 
					
						2021-03-18 13:31:42 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							83c1b919a7 
							
						 
					 
					
						
						
							
							Fix positional/option in CLI types  
						
						
						
					 
					
						2021-03-18 13:31:42 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							9fd41d6742 
							
						 
					 
					
						
						
							
							Remove Language.pipe cleanup arg  
						
						
						
					 
					
						2021-03-18 13:31:42 +01:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							40bc01e668 
							
						 
					 
					
						
						
							
							Proactively remove unused listeners  
						
						
						
						
						With this the changes in initialize.py might be unnecessary.
Requires testing. 
						
					 
					
						2021-03-17 22:41:41 +09:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5da323fd86 
							
						 
					 
					
						
						
							
							Minor edits  
						
						
						
					 
					
						2021-03-17 12:59:05 +01:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
						
						
							
						
						
							a5ffe8dfed 
							
						 
					 
					
						
						
							
							Add details about pretrained pipeline design  
						
						
						
					 
					
						2021-03-17 11:31:26 +01:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							ef77c88638 
							
						 
					 
					
						
						
							
							Don't warn about components not in the pipeline  
						
						
						
						
						See here:
https://github.com/explosion/spaCy/discussions/7463 
Still need to check if there are any side effects of listeners being
present but not in the pipeline, but this commit will silence the
warnings. 
						
					 
					
						2021-03-17 14:56:04 +09:00