Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5ace559201 
							
						 
					 
					
						
						
							
							ensure span.text works for an empty span ( #6772 )  
						
						
						
					 
					
						2021-01-21 23:18:46 +08:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2af31a8c8d 
							
						 
					 
					
						
						
							
							Bugfix textcat reproducibility on GPU ( #6411 )  
						
						
						
						
						* add seed argument to ParametricAttention layer
* bump thinc to 7.4.3
* set thinc version range
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-11-23 12:29:35 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2998131416 
							
						 
					 
					
						
						
							
							Reproducibility for TextCat and Tok2Vec ( #6218 )  
						
						
						
						
						* ensure fixed seed in HashEmbed layers
* forgot about the joys of python 2 
						
					 
					
						2020-10-08 00:43:46 +02:00 
						 
				 
			
				
					
						
							
							
								Florijan Stamenković 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9db670b996 
							
						 
					 
					
						
						
							
							Fix Issue 6207 ( #6208 )  
						
						
						
						
						* Regression test for issue 6207
* Fix issue 6207
* Sign contributor agreement
* Minor adjustments to test
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com> 
						
					 
					
						2020-10-06 11:17:37 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f7a25d69f7 
							
						 
					 
					
						
						
							
							Bugfix in merge_entities ( #6005 )  
						
						
						
						
						* failing test
* bugfix 
						
					 
					
						2020-09-01 21:57:52 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							071c09ff35 
							
						 
					 
					
						
						
							
							add coding ( #5942 )  
						
						
						
					 
					
						2020-08-20 11:08:38 +02:00 
						 
				 
			
				
					
						
							
							
								Gustavo Zadrozny Leyendecker 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							90b958fd01 
							
						 
					 
					
						
						
							
							Fix EntityRenderer to support line breaks (after last entity) ( closes #5838 )  
						
						
						
					 
					
						2020-07-29 18:48:39 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0a62098c5f 
							
						 
					 
					
						
						
							
							Fix lemmatizer is_base_form for python2.7 ( #5734 )  
						
						
						
						
						* Fix lemmatizer init args for python2.7
* Move English is_base_form to a class method
* Skip test pickling PhraseMatcher for python2 
						
					 
					
						2020-07-09 22:11:24 +02:00 
						 
				 
			
				
					
						
							
							
								graue70 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9860b8399e 
							
						 
					 
					
						
						
							
							Fix typo in test function docstring ( #5696 )  
						
						
						
					 
					
						2020-07-05 15:49:06 +02:00 
						 
				 
			
				
					
						
							
							
								Adriane Boyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							167df42cb6 
							
						 
					 
					
						
						
							
							Move lemmatizer is_base_form to language settings ( #5663 )  
						
						
						
						
						Move `Lemmatizer.is_base_form` to the language settings so that each
language can provide a language-specific method as
`LanguageDefaults.is_base_form`.
The existing English-specific `Lemmatizer.is_base_form` is moved to
`EnglishDefaults`. 
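
The move described above can be pictured with a small, self-contained sketch. It is not the spaCy implementation: the classes below are hypothetical stand-ins that only borrow the names from the commit message, and the suffix rule in the lemmatizer is a toy fallback invented for illustration. The point is that the lemmatizer receives the language-specific check as a callable instead of hard-coding the English one.

```python
class LanguageDefaults:
    """Hypothetical stand-in for per-language settings (not spaCy's class)."""

    @classmethod
    def is_base_form(cls, univ_pos, morphology=None):
        # Generic default: never claim that a word is already in base form.
        return False


class EnglishDefaults(LanguageDefaults):
    """Hypothetical English settings carrying the English-specific check."""

    @classmethod
    def is_base_form(cls, univ_pos, morphology=None):
        morphology = morphology or {}
        if univ_pos == "noun" and morphology.get("Number") == "sing":
            return True
        if univ_pos == "verb" and morphology.get("VerbForm") == "inf":
            return True
        return False


class Lemmatizer:
    """Toy lemmatizer that takes the base-form check as an argument."""

    def __init__(self, is_base_form=None):
        self.is_base_form = is_base_form

    def __call__(self, string, univ_pos, morphology=None):
        string = string.lower()
        # Short-circuit when the language says the word is a base form.
        if self.is_base_form is not None and self.is_base_form(univ_pos, morphology):
            return [string]
        # Toy fallback rule (an assumption for this sketch): strip a noun's final "s".
        if univ_pos == "noun" and string.endswith("s"):
            return [string[:-1]]
        return [string]
```

With `Lemmatizer(is_base_form=EnglishDefaults.is_base_form)`, a singular noun such as "glass" is returned unchanged, whereas the generic default would let the toy rule strip the final "s".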
						
					 
					
						2020-06-29 14:16:57 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c685ee734a 
							
						 
					 
					
						
						
							
							Fix compat for v2.x branch  
						
						
						
					 
					
						2020-05-22 14:22:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							93c4d13588 
							
						 
					 
					
						
						
							
							Merge pull request  #5264  from lfiedler/issue-5230  
						
						
						
						
						Fix ResourceWarnings during unittest 
						
					 
					
						2020-05-22 00:31:07 +02:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							36a94c409a 
							
						 
					 
					
						
						
							
							failing test to reproduce overlapping spans problem  
						
						
						
					 
					
						2020-05-20 23:06:03 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							40e65d6f63 
							
						 
					 
					
						
						
							
							Fix most_similar for vectors with unused rows ( #5348 )  
						
						
						
						
						* Fix most_similar for vectors with unused rows

Address issues related to the unused rows in the vector table and `most_similar`:

* Update `most_similar()` to search only through rows that are in use according to `key2row`.
* Raise an error when `n` in `most_similar(n=n)` is larger than the number of vectors in the table.
* Set and restore `_unset` correctly when vectors are added or deserialized so that new vectors are added in the correct row.
* Set data and keys to the same length in `Vocab.prune_vectors()` to avoid spurious entries in `key2row`.

* Fix regression test using `most_similar`

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com> 
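
As a rough illustration of the first two points in this commit body, the sketch below restricts a nearest-neighbour search to rows referenced by `key2row` and rejects an `n` that exceeds the number of rows in use. It is a pure-Python approximation under stated assumptions, not the spaCy code: the function names, the list-of-lists table, and the cosine helper are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, data, key2row, n=1):
    """Return the n keys whose rows are closest to query.

    data is the full table and may contain unused (for example zeroed) rows;
    only rows that appear as values in key2row are searched.
    """
    used_rows = sorted(set(key2row.values()))
    if n > len(used_rows):
        raise ValueError(
            "n={} is larger than the {} rows in use".format(n, len(used_rows))
        )
    # If several keys share a row, this keeps one of them (sketch simplification).
    row2key = {row: key for key, row in key2row.items()}
    scored = [(cosine(query, data[row]), row2key[row]) for row in used_rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scored[:n]]
```

A table with four rows of which only two are mapped in `key2row` is searched over those two rows only, so the unused rows can never be returned as matches.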
						
					 
					
						2020-05-19 16:41:26 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cfdaf99b80 
							
						 
					 
					
						
						
							
							Fix passing of component configuration ( #5374 )  
						
						
						
						
						* add kwargs to to_disk methods in docs - otherwise crashes on 'exclude' argument
* add fix and test for Issue 5137 
						
					 
					
						2020-04-29 12:56:17 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f67343295d 
							
						 
					 
					
						
						
							
							Update NEL examples and documentation ( #5370 )  
						
						
						
						
						* simplify creation of KB by skipping dim reduction
* small fixes to train EL example script
* add KB creation and NEL training example scripts to example section
* update descriptions of example scripts in the documentation
* moving wiki_entity_linking folder from bin to projects
* remove test for wiki NEL functionality that is being moved 
						
					 
					
						2020-04-29 12:53:53 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f8ac5b9f56 
							
						 
					 
					
						
						
							
							bugfix in span similarity ( #5155 ) ( #5358 )  
						
						
						
						
						* bugfix in span similarity
* also rewrite doc.pyx for clarity
* formatting
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com> 
						
					 
					
						2020-04-27 16:51:27 +02:00 
						 
				 
			
				
					
						
							
							
								Jakob Jul Elben 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							663333c3b2 
							
						 
					 
					
						
						
							
							Fixes   #5413  ( #5315 )  
						
						
						
						
						* Fix 5314
* Add contributor
* Resolve requested changes
Co-authored-by: Jakob Jul Elben <jakob@datamaga.com> 
						
					 
					
						2020-04-16 13:29:02 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							d60e2d3ebf 
							
						 
					 
					
						
						
							
							issue5230 added unit test for dumping and loading knowledgebase  
						
						
						
					 
					
						2020-04-12 09:08:41 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							d2bb649227 
							
						 
					 
					
						
						
							
							issue5230 filter warnings in addition to filterwarnings to prevent deprecation warnings from popping up in the python35(win) setup  
						
						
						
					 
					
						2020-04-10 23:21:13 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							ca2a7a44db 
							
						 
					 
					
						
						
							
							issue5230 store string values of warnings to remotely debug failing python35(win) setup  
						
						
						
					 
					
						2020-04-10 22:26:55 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							88ca40a15d 
							
						 
					 
					
						
						
							
							issue5230 raise warnings as errors to remotely debug failing python35(win) setup  
						
						
						
					 
					
						2020-04-10 21:45:53 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							a7bdfe42e1 
							
						 
					 
					
						
						
							
							issue5230 added print statement to warnings filter to remotely debug failing python35(win) setup  
						
						
						
					 
					
						2020-04-10 21:14:33 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							8c1d0d628f 
							
						 
					 
					
						
						
							
							issue5230 writer now checks instance of loc parameter before trying to operate on it  
						
						
						
					 
					
						2020-04-10 20:35:52 +02:00 
						 
				 
			
				
					
						
							
							
								lfiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							e1e25c7e30 
							
						 
					 
					
						
						
							
							issue5230: added unittest test case for completion  
						
						
						
					 
					
						2020-04-06 21:36:02 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							cde96f6c64 
							
						 
					 
					
						
						
							
							issue5230: optimized unit test a bit  
						
						
						
					 
					
						2020-04-06 20:51:12 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							71cc903d65 
							
						 
					 
					
						
						
							
							issue5230: replaced open statements on path objects so that serialization still works and files are closed  
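
The general pattern behind this kind of fix can be sketched as follows, assuming Python 3 and the standard `pathlib` module. The helper names are hypothetical and do not come from the commit: the idea is only that a file handle opened from a path object is wrapped in a `with` block, so it is closed deterministically instead of being left for the garbage collector, which is what triggers a `ResourceWarning`.

```python
from pathlib import Path


def write_text(path, text):
    """Write text to path and close the file handle before returning."""
    path = Path(path)  # accept both str and Path (sketch assumption)
    with path.open("w", encoding="utf8") as file_:
        file_.write(text)


def read_text(path):
    """Read path as UTF-8 text and close the file handle before returning."""
    with Path(path).open("r", encoding="utf8") as file_:
        return file_.read()
```

By contrast, an expression such as `Path(path).open().read()` leaves the handle open until it is garbage-collected.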
						
						
						
					 
					
						2020-04-06 20:30:41 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							273ed452bb 
							
						 
					 
					
						
						
							
							issue5230: added unicode declaration at top of the file  
						
						
						
					 
					
						2020-04-06 19:22:32 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							1cd975d4a5 
							
						 
					 
					
						
						
							
							issue5230: fixed resource warnings in language  
						
						
						
					 
					
						2020-04-06 18:54:32 +02:00 
						 
				 
			
				
					
						
							
							
								Leander Fiedler 
							
						 
					 
					
						
						
						
						
							
						
						
							493c77462a 
							
						 
					 
					
						
						
							
							issue5230: test cases  
						
						
						
						
						covering known sources of resource warnings 
						
					 
					
						2020-04-06 18:46:51 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							828acffc12 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2020-03-25 12:28:12 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1a2b8fc264 
							
						 
					 
					
						
						
							
							set vector of merged entity ( #5085 )  
						
						
						
						
						* merge_entities sets the vector in the vocab for the merged token
* add unit test
* import unicode_literals
* move code to _merge function
* only set vector if vocab has non-zero vectors 
						
					 
					
						2020-03-06 14:45:28 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d307e9ca58 
							
						 
					 
					
						
						
							
							take care of global vectors in multiprocessing ( #5081 )  
						
						
						
						
						* restore load_nlp.VECTORS in the child process
* add unit test
* fix test
* remove unnecessary import
* add utf8 encoding
* import unicode_literals 
						
					 
					
						2020-03-03 13:58:22 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c6b12ab02a 
							
						 
					 
					
						
						
							
							Bugfix/get doc ( #5049 )  
						
						
						
						
						* new (broken) unit test
* fixing get_doc method 
						
					 
					
						2020-03-02 11:49:28 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							b49a3afd0c 
							
						 
					 
					
						
						
							
							use clean_underscore fixture  
						
						
						
					 
					
						2020-02-23 15:49:20 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							6e717c62ed 
							
						 
					 
					
						
						
							
							avoid the tests interacting with each other through the global Underscore variable  
						
						
						
					 
					
						2020-02-12 13:21:31 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							7939c63886 
							
						 
					 
					
						
						
							
							use English instead of model  
						
						
						
					 
					
						2020-02-12 12:26:27 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							46628d8890 
							
						 
					 
					
						
						
							
							add some asserts  
						
						
						
					 
					
						2020-02-12 12:12:52 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							51d37033c8 
							
						 
					 
					
						
						
							
							remove old comment  
						
						
						
					 
					
						2020-02-12 12:10:05 +01:00 
						 
				 
			
				
					
						
							
							
								svlandeg 
							
						 
					 
					
						
						
						
						
							
						
						
							05dedaa2cf 
							
						 
					 
					
						
						
							
							add unit test  
						
						
						
					 
					
						2020-02-12 12:00:13 +01:00 
						 
				 
			
				
					
						
							
							
								Tyler Couto 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9fa9d7f2cb 
							
						 
					 
					
						
						
							
							Fix for Issue 4665 - conllu2json ( #4953 )  
						
						
						
						
						* Fix for Issue 4665 - conllu2json
- Allowing HEAD to be an underscore
* Added contributor agreement 
						
					 
					
						2020-02-03 13:01:48 +01:00 
						 
				 
			
				
					
						
							
							
								Yohei Tamura 
							
						 
					 
					
						
						
						
						
							
						
						
							708a4d27eb 
							
						 
					 
					
						
						
							
							fix nlp.evaluate ( #4924 ) ( #4925 )  
						
						
						
						
						* new file:   test_issue4924.py
* modified:   spacy/gold.pyx
* modified:   test_issue4924.py for python2 
						
					 
					
						2020-01-20 12:17:46 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
						
						
							
						
						
							a1b22e90cd 
							
						 
					 
					
						
						
							
							serialize ENT_ID ( #4852 )  
						
						
						
						
						* expand serialization test for custom token attribute
* add failing test for issue 4849
* define ENT_ID as attr and use in doc serialization
* fix a few typos 
						
					 
					
						2020-01-06 14:57:34 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							3431ac42de 
							
						 
					 
					
						
						
							
							Fix typo  
						
						
						
					 
					
						2019-12-21 21:17:45 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7c69d30de5 
							
						 
					 
					
						
						
							
							Tidy up and expect warning  
						
						
						
					 
					
						2019-12-21 21:14:52 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							cb4145adc7 
							
						 
					 
					
						
						
							
							Tidy up and auto-format  
						
						
						
					 
					
						2019-12-21 19:04:17 +01:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
						
						
							
						
						
							f9b541f9ef 
							
						 
					 
					
						
						
							
							More robust set entities method in KB ( #4794 )  
						
						
						
						
						* add unit test for setting entities with duplicate identifiers
* count the number of actual unique identifiers and throw duplicate warning 
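
A minimal sketch of the behaviour these two points describe, assuming a plain dictionary in place of the knowledge base: duplicate identifiers are skipped with a warning, so the number of stored entries equals the number of unique identifiers. The function name mirrors the commit title, but the signature, the return value, and the warning text are assumptions made for this example.

```python
import warnings


def set_entities(entity_list, freq_list):
    """Store each unique entity identifier once, warning about duplicates."""
    if len(entity_list) != len(freq_list):
        raise ValueError("entity_list and freq_list must have the same length")
    entries = {}
    for entity, freq in zip(entity_list, freq_list):
        if entity in entries:
            # Keep the first occurrence and report the duplicate.
            warnings.warn("Entity '{}' already exists and is skipped".format(entity))
            continue
        entries[entity] = freq
    return entries
```

Calling it with `["Q1", "Q2", "Q1"]` keeps two entries and emits one warning for the repeated `"Q1"`.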
						
					 
					
						2019-12-13 10:45:29 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5b36dec7eb 
							
						 
					 
					
						
						
							
							Auto-exclude disabled when calling from_disk during load ( #4708 )  
						
						
						
					 
					
						2019-11-25 16:01:22 +01:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5d4eede1e4 
							
						 
					 
					
						
						
							
							Fix test util imports  
						
						
						
					 
					
						2019-11-21 16:28:29 +01:00 
						 
				 
			
				
					
						
							
							
								GuiGel 
							
						 
					 
					
						
						
						
						
							
						
						
							8f7ab70870 
							
						 
					 
					
						
						
							
							Bugfix/fix entity ruler from disk ( #4670 )  
						
						
						
						
						* fix EntityRuler from_disk bug
* add contributor file
* Test EntityRuler PhraseMatcher deserialization (#4651 )
* newline at end of file
* fix copy paste error
* serializing the EntityRuler by itself
* Add unicode declarations for Python 2 and auto-format 
						
					 
					
						2019-11-21 16:26:37 +01:00