Peter Gilles 
							
						 
					 
					
						
						
						
						
							
						
						
							428887b8f2 
							
						 
					 
					
						
						
							
							Initial commit: New language Luxembourgish (lb) ( #4424 )  
						
						
						
						
						* new language: Luxembourgish (lb)
* update
* update
* Update and rename .github/CONTRIBUTOR_AGREEMENT.md to .github/contributors/PeterGilles.md
* Update and rename .github/contributors/PeterGilles.md to .github/CONTRIBUTOR_AGREEMENT.md
* Update norm_exceptions.py
* Delete README.md
* moved test_lemma.py
* deactivated 'lemma_lookup = LOOKUP'
* update
* Update conftest.py
* update
* tests updated
* import unicode_literals
* Update spacy/tests/lang/lb/test_text.py
Co-Authored-By: Ines Montani <ines@ines.io>
* Create PeterGilles.md 
						
					 
					
						2019-10-14 12:27:50 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							98a961a60e 
							
						 
					 
					
						
						
							
							Fix PhraseMatcher.remove for overlapping patterns ( #4437 )  
						
						
						
					 
					
						2019-10-14 12:19:51 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f8f68bb062 
							
						 
					 
					
						
						
							
							Auto-format [ci skip]  
						
						
						
					 
					
						2019-10-10 17:08:39 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							d2d2baaf76 
							
						 
					 
					
						
						
							
							Revert training example edit from #4327 ( #4403 )
						
						
						
						
						I think the original annotation was correct and this change also
unfortunately introduced a cycle into the dependency tree. 
						
					 
					
						2019-10-10 17:00:26 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							6f54e59fe7 
							
						 
					 
					
						
						
							
							Fix util.filter_spans() to prefer first span in overlapping sam… ( #4414 )  
						
						
						
						
						* Update util.filter_spans() to prefer earlier spans
* Add filter_spans test for first same-length span
* Update entity relation example to refer to util.filter_spans() 
						
					 
					
						2019-10-10 17:00:03 +02:00 
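
The tie-breaking rule named in the `util.filter_spans()` commit above (prefer longer spans, and among same-length overlapping spans prefer the earlier one) can be sketched as follows. This is a minimal illustration on plain `(start, end)` tuples, not spaCy's implementation; the function name `filter_spans_sketch` is hypothetical.

```python
def filter_spans_sketch(spans):
    """Keep non-overlapping (start, end) spans.

    Hypothetical helper for illustration only: longer spans win,
    and among spans of equal length the earlier one wins.
    """
    # Sort by length (descending), then by start offset (ascending).
    ordered = sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))
    kept = []
    seen = set()  # token indices already claimed by a kept span
    for start, end in ordered:
        tokens = range(start, end)
        if not any(t in seen for t in tokens):
            kept.append((start, end))
            seen.update(tokens)
    # Return the surviving spans in document order.
    return sorted(kept)


print(filter_spans_sketch([(1, 3), (0, 2), (2, 5), (4, 5)]))
# prints [(0, 2), (2, 5)]
```

Here `(0, 2)` and `(1, 3)` have the same length and overlap, so the earlier span `(0, 2)` is the one that survives.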
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
						
						
							
						
						
							da6e0de34f 
							
						 
					 
					
						
						
							
							fix attrs field in the matcher ( #4423 )  
						
						
						
						
						* raise specific error when removing a matcher rule that doesn't exist
* rephrasing
* ensure attrs is NULL when nr_attr == 0 + several fixes to prevent OOB 
						
					 
					
						2019-10-10 15:20:59 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
						
						
							
						
						
							5efae495f1 
							
						 
					 
					
						
						
							
							Error when removing a matcher rule that doesn't exist ( #4420 )  
						
						
						
						
						* raise specific error when removing a matcher rule that doesn't exist
* rephrasing 
						
					 
					
						2019-10-10 14:01:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fa95c030a5 
							
						 
					 
					
						
						
							
							Unify matcher get_ent_id and get_pattern_key ( #4415 )  
						
						
						
						
						This is basically stabbing blindly at the ghost match problem, but it at
least seems like there was a bug previously here --- so this should
hopefully be an improvement, even if it doesn't fix the ghost match
problem. 
						
					 
					
						2019-10-09 15:26:31 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							77643de2ca 
							
						 
					 
					
						
						
							
							Downgrade importlib_metadata requirement  
						
						
						
					 
					
						2019-10-08 23:43:24 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							5cbe21700b 
							
						 
					 
					
						
						
							
							Only show label scheme if not empty [ci skip]  
						
						
						
					 
					
						2019-10-08 15:52:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							8f76d6c9ef 
							
						 
					 
					
						
						
							
							Update transformer model details [ci skip]  
						
						
						
					 
					
						2019-10-08 15:39:38 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							dd30d3ec99 
							
						 
					 
					
						
						
							
							Add setuptools as runtime dependency  
						
						
						
					 
					
						2019-10-08 12:46:59 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							c4f95c1569 
							
						 
					 
					
						
						
							
							Update formatting and docstrings [ci skip]  
						
						
						
					 
					
						2019-10-08 12:25:23 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ddd6fda59c 
							
						 
					 
					
						
						
							
							Add registry for model creation functions ('architectures') ( #4395 )  
						
						
						
						
						* Add architecture registry
* Add test for arch registry
* Add error for model architectures 
						
					 
					
						2019-10-08 12:21:03 +02:00 
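
The general idea behind a registry for model creation functions, with an error for unknown names, can be sketched in plain Python as below. All names here (`ARCHITECTURES`, `register`, `lookup`, `build_demo_tagger`) are hypothetical and chosen for illustration; they are not claimed to match the functions this commit added to spaCy.

```python
# Hypothetical module-level registry mapping names to creation functions.
ARCHITECTURES = {}


def register(name):
    """Decorator that stores a model creation function under `name`."""
    def decorator(func):
        if name in ARCHITECTURES:
            raise ValueError("Architecture already registered: " + name)
        ARCHITECTURES[name] = func
        return func
    return decorator


def lookup(name):
    """Return the creation function for `name`, or raise a clear error."""
    if name not in ARCHITECTURES:
        available = ", ".join(sorted(ARCHITECTURES)) or "(none)"
        raise KeyError(
            "Unknown architecture '%s'. Available: %s" % (name, available)
        )
    return ARCHITECTURES[name]


@register("demo.tagger.v1")
def build_demo_tagger(width=64):
    # Stand-in for a real model constructor; returns a plain dict here.
    return {"type": "tagger", "width": width}


model = lookup("demo.tagger.v1")(width=8)
print(model)
```

Looking functions up by string name is what lets a config file refer to an architecture without importing it directly.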
						 
				 
			
				
					
						
							
							
								tamuhey 
							
						 
					 
					
						
						
						
						
							
						
						
							650cbfe82d 
							
						 
					 
					
						
						
							
							multiprocessing pipe ( #1303 ) ( #4371 )  
						
						
						
						
						* refactor: separate formatting docs and golds in Language.update
* fix return typo
* add pipe test
* unpickleable object cannot be assigned to p.map
* passed test pipe
* passed test!
* pipe terminate
* try pipe
* passed test
* fix ch
* add comments
* fix len(texts)
* add comment
* add comment
* fix: multiprocessing of pipe is not supported in Python 2
* test: use assert_docs_equal
* fix: is_python3 -> is_python2
* fix: change _pipe arg to use functools.partial
* test: add vector modification test
* test: add sample ner_pipe and user_data pipe
* add warnings test
* test: fix user warnings
* test: fix warnings capture
* fix: remove islice import
* test: remove warnings test
* test: add stream test
* test: rename
* fix: multiproc stream
* fix: stream pipe
* add comment
* mp.Pipe seems to be usable with relatively small data
* test: skip stream test in python2
* sort imports
* test: add reason to skiptest
* fix: use pipe for docs communication
* add comments
* add comment 
						
					 
					
						2019-10-08 12:20:55 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							14841d0aa6 
							
						 
					 
					
						
						
							
							Fix PhraseMatcher callback and add tests ( #4399 )  
						
						
						
						
						* Fix callback lookup in PhraseMatcher (string key rather than hash key)
* Add callback tests for Matcher and PhraseMatcher 
						
					 
					
						2019-10-08 12:07:02 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fd4a5341b0 
							
						 
					 
					
						
						
							
							Fix ner_jsonl2json converter ( fix #4389 ) ( #4394 )
						
						
						
					 
					
						2019-10-08 00:52:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							29f9fec267 
							
						 
					 
					
						
						
							
							Improve spacy pretrain ( #4393 )  
						
						
						
						
						* Support bilstm_depth arg in spacy pretrain
* Add option to ignore zero vectors in get_cossim_loss
* Use cosine loss in Cloze multitask 
						
					 
					
						2019-10-07 23:34:58 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9cd6ca3e4d 
							
						 
					 
					
						
						
							
							Improve usage of pkg_resources and handling of entry points ( #4387 )  
						
						
						
						
						* Only import pkg_resources where it's needed
Apparently it's really slow
* Use importlib_metadata for entry points
* Revert "Only import pkg_resources where it's needed"
This reverts commit 5ed8c03afa8b30b579579f071f5c4002e12a17ec 
						
					 
					
						2019-10-07 17:22:09 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							d53a8d9313 
							
						 
					 
					
						
						
							
							Consider batch_size when sorting similar vectors ( #4388 )  
						
						
						
					 
					
						2019-10-07 13:38:35 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							a3509f67d4 
							
						 
					 
					
						
						
							
							Extend unicode character block for Sinhala ( #4378 )  
						
						
						
						
						* Extend unicode character block for Sinhala
* Add sentencizer tests for more languages 
						
					 
					
						2019-10-07 13:17:03 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							573e543e4a 
							
						 
					 
					
						
						
							
							Alphanumeric -> alphabetic [ci skip]  
						
						
						
						
						see ines/spacy-course#38  
						
					 
					
						2019-10-06 13:30:01 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							cbc2cee2c8 
							
						 
					 
					
						
						
							
							Improve URL_PATTERN and handling in tokenizer ( #4374 )  
						
						
						
						
						* Move prefix and suffix detection for URL_PATTERN
Move prefix and suffix detection for `URL_PATTERN` into the tokenizer.
Remove associated lookahead and lookbehind from `URL_PATTERN`.
Fix tokenization for Hungarian given the modified handling of prefixes
and suffixes.
* Match a wider range of URI schemes 
						
					 
					
						2019-10-05 13:00:09 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e65dffd80b 
							
						 
					 
					
						
						
							
							Clarify serialization of extension attributes ( closes #4377 ) [ci skip]
						
						
						
					 
					
						2019-10-05 11:58:00 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							fec9433044 
							
						 
					 
					
						
						
							
							Make PhraseMatcher.vocab consistent with Matcher.vocab ( closes #4373 )
						
						
						
					 
					
						2019-10-04 12:18:41 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e7ddc6f662 
							
						 
					 
					
						
						
							
							Add conda install for lookups [ci skip]  
						
						
						
					 
					
						2019-10-03 17:52:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							37ef874d8b 
							
						 
					 
					
						
						
							
							Set version to v2.2.1  
						
						
						
					 
					
						2019-10-03 14:50:39 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
						
						
							
						
						
							4e7259c6cf 
							
						 
					 
					
						
						
							
							Bugfix initializing DocBin with attributes ( #4368 )  
						
						
						
						
						* docbin init fix + documentation fix + unit tests
* newline
* try with zlib instead of gzip (python 2 incompatibilities) 
						
					 
					
						2019-10-03 14:48:45 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							ce1d441de5 
							
						 
					 
					
						
						
							
							Add docs for Vectors.most_similar [ci skip]  
						
						
						
					 
					
						2019-10-03 14:29:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ben Taylor 
							
						 
					 
					
						
						
						
						
							
						
						
							1db79a33cb 
							
						 
					 
					
						
						
							
							most_similar() return the k most similar vectors ( #4364 )  
						
						
						
						
						* most_similar return n-most similar vectors
* updated most_similar comment
* add bintay contributor agreement
* sign bintay contributor agreement
* fix most_similar documentation typo
* fixed error in prune_vectors
* updated prune_vectors test 
						
					 
					
						2019-10-03 14:09:44 +02:00 
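
Returning the k most similar vectors, rather than only the single best match, can be sketched as a ranking by cosine similarity followed by a cut at `k`. The following is a minimal pure-Python illustration under that assumption; `most_similar_sketch` and its dictionary-based vector table are hypothetical and do not reproduce the signature or internals of `Vectors.most_similar`.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Treat zero vectors as having no similarity to anything.
    return dot / norm if norm else 0.0


def most_similar_sketch(query, table, k=1):
    """Return the k (key, score) pairs in `table` most similar to `query`."""
    scored = [(key, cosine(query, vec)) for key, vec in table.items()]
    # Highest similarity first, then keep only the top k entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


table = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top_two = most_similar_sketch([1.0, 0.1], table, k=2)
print([key for key, _ in top_two])
# prints ['a', 'c']
```

A full sort is the simplest way to show the idea; for large tables a partial selection of the top k would avoid sorting every score.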
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4159936720 
							
						 
					 
					
						
						
							
							Update README.md [ci skip]  
						
						
						
					 
					
						2019-10-02 19:15:22 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e4782feae9 
							
						 
					 
					
						
						
							
							Update README.md [ci skip]  
						
						
						
					 
					
						2019-10-02 18:49:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							80cf385f65 
							
						 
					 
					
						
						
							
							Update v2-2.md [ci skip]  
						
						
						
					 
					
						2019-10-02 16:58:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							f8e606c303 
							
						 
					 
					
						
						
							
							Update README.md [ci skip]  
						
						
						
					 
					
						2019-10-02 16:47:10 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							12a941d841 
							
						 
					 
					
						
						
							
							Update binder version [ci skip]  
						
						
						
					 
					
						2019-10-02 16:47:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2eb31012e7 
							
						 
					 
					
						
						
							
							Set version to v2.2.0  
						
						
						
					 
					
						2019-10-02 14:40:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							796072e560 
							
						 
					 
					
						
						
							
							Set version to v2.2.0.dev19  
						
						
						
					 
					
						2019-10-02 12:51:29 +02:00 
						 
				 
			
				
					
						
							
							
								Sofie Van Landeghem 
							
						 
					 
					
						
						
						
						
							
						
						
							9d3ce7cba2 
							
						 
					 
					
						
						
							
							Ensure training doesn't crash with empty batches ( #4360 )  
						
						
						
						
						* unit test for previously resolved unflatten issue
* prevent a batch of empty docs from causing problems
						
					 
					
						2019-10-02 12:50:47 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							52b5912dbf 
							
						 
					 
					
						
						
							
							Tidy up [ci skip]  
						
						
						
					 
					
						2019-10-02 12:05:59 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							d82241218a 
							
						 
					 
					
						
						
							
							Make the default NER labels less model-specific [ci skip] ( #4361 )  
						
						
						
					 
					
						2019-10-02 12:05:17 +02:00 
						 
				 
			
				
					
						
							
							
								adrianeboyd 
							
						 
					 
					
						
						
						
						
							
						
						
							dda86118bd 
							
						 
					 
					
						
						
							
							Update Ukrainian lemmatizer with new lookups ( #4359 )  
						
						
						
						
						* Update Ukrainian lemmatizer with new lookups
* Add missing import
Co-authored-by: Ines Montani <ines@ines.io> 
						
					 
					
						2019-10-02 12:04:06 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							b6670bf0c2 
							
						 
					 
					
						
						
							
							Use consistent spelling  
						
						
						
					 
					
						2019-10-02 10:37:39 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							208629615d 
							
						 
					 
					
						
						
							
							Auto-format  
						
						
						
					 
					
						2019-10-02 10:37:04 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							867e93aae2 
							
						 
					 
					
						
						
							
							Add Streamlit example [ci skip]  
						
						
						
					 
					
						2019-10-02 01:21:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							38b6e69389 
							
						 
					 
					
						
						
							
							Merge branch 'master' of https://github.com/explosion/spaCy
						
						
						
					 
					
						2019-10-01 22:28:25 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d4b63bb6dd 
							
						 
					 
					
						
						
							
							Set version to v2.2.0  
						
						
						
					 
					
						2019-10-01 22:28:13 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							9885b5ae68 
							
						 
					 
					
						
						
							
							Update spacy_lookups_data version [ci skip]  
						
						
						
					 
					
						2019-10-01 22:21:21 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							475e3188ce 
							
						 
					 
					
						
						
							
							Add docs on filtering overlapping spans for merging ( resolves #4352 ) [ci skip]
						
						
						
					 
					
						2019-10-01 21:59:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							667f294627 
							
						 
					 
					
						
						
							
							Merge branch 'master' of https://github.com/explosion/spaCy
						
						
						
					 
					
						2019-10-01 21:37:25 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							0dd127bb00 
							
						 
					 
					
						
						
							
							Update v2-2.md [ci skip]  
						
						
						
					 
					
						2019-10-01 21:37:06 +02:00