Orion Montoya 
							
						 
					 
					
						
						
						
						
							
						
						
							e81a608173 
							
						 
					 
					
						
						
							
							Regression test for lemmatizer exceptions -- demonstrate issue  #1387  
						
						
						
					 
					
						2017-10-05 10:47:48 -04:00 
						 
				 
			
				
					
						
							
							
								Wannaphong Phatthiyaphaibun 
							
						 
					 
					
						
						
						
						
							
						
						
							1abf472068 
							
						 
					 
					
						
						
							
							add th test  
						
						
						
					 
					
						2017-09-21 12:56:58 +07:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ddaff6ca56 
							
						 
					 
					
						
						
							
							Merge pull request  #1287  from IamJeffG/feature/1226-more-complete-noun-chunks  
						
						
						
						
						Capture more noun chunks 
						
					 
					
						2017-09-08 07:59:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							45029a550e 
							
						 
					 
					
						
						
							
							Fix customized-tokenizer tests  
						
						
						
					 
					
						2017-09-04 20:13:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							34c585396a 
							
						 
					 
					
						
						
							
							Merge pull request  #1294  from Vimos/master  
						
						
						
						
						Fix issue #1292  and add test case for the Assertion Error 
						
					 
					
						2017-09-04 19:20:40 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c68f188eb0 
							
						 
					 
					
						
						
							
							Fix error on test  
						
						
						
					 
					
						2017-09-04 18:59:36 +02:00 
						 
				 
			
				
					
						
							
							
								Eric Zhao 
							
						 
					 
					
						
						
						
						
							
						
						
							d61c117081 
							
						 
					 
					
						
						
							
							Lowest common ancestor matrix for spans and docs  
						
						
						
						
						Added functionality for spans and docs to get lowest common ancestor
matrix by simply calling: doc.get_lca_matrix() or
doc[:3].get_lca_matrix().
Corresponding unit tests were also added under spacy/tests/doc and
spacy/tests/spans.
Designed to address: https://github.com/explosion/spaCy/issues/969 . 
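
To make the idea of a lowest common ancestor matrix concrete, here is a minimal plain-Python sketch. It is not spaCy's implementation: the `heads` list (each token's head index, with the root pointing at itself), the helper names, and the use of -1 for "no common ancestor" are all assumptions made for illustration.

```python
def ancestors(heads, i):
    """Return token i followed by its chain of heads up to the root.

    Assumes `heads` describes a well-formed tree (no cycles) in which
    the root token is its own head.
    """
    chain = [i]
    while heads[chain[-1]] != chain[-1]:
        chain.append(heads[chain[-1]])
    return chain


def lca_matrix(heads):
    """Build an n x n matrix whose cell [i][j] holds the index of the
    lowest common ancestor of tokens i and j, or -1 if they share none."""
    n = len(heads)
    matrix = [[-1] * n for _ in range(n)]
    for i in range(n):
        chain_i = ancestors(heads, i)
        for j in range(n):
            on_path_j = set(ancestors(heads, j))
            # The first node on i's path to the root that also lies on
            # j's path is the lowest common ancestor.
            for node in chain_i:
                if node in on_path_j:
                    matrix[i][j] = node
                    break
    return matrix


# Hypothetical parse of "This is a test":
# This -> is, is -> is (root), a -> test, test -> is
heads = [1, 1, 3, 1]
matrix = lca_matrix(heads)
for row in matrix:
    print(row)
```

The diagonal holds each token's own index, and a token paired with one of its ancestors yields that ancestor.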
						
					 
					
						2017-09-03 12:22:19 -07:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9bffcaa73d 
							
						 
					 
					
						
						
							
							Update test to make it slightly more direct  
						
						
						
						
						The `nlp` container should be unnecessary here. If so, we can test the tokenizer class just a little more directly. 
						
					 
					
						2017-09-01 21:16:56 +02:00 
						 
				 
			
				
					
						
							
							
								Vimos Tan 
							
						 
					 
					
						
						
						
						
							
						
						
							a6d9fb5bb6 
							
						 
					 
					
						
						
							
							fix issue  #1292  
						
						
						
					 
					
						2017-08-30 14:49:14 +08:00 
						 
				 
			
				
					
						
							
							
								Jeffrey Gerard 
							
						 
					 
					
						
						
						
						
							
						
						
							884ba168a8 
							
						 
					 
					
						
						
							
							Capture more noun chunks  
						
						
						
					 
					
						2017-08-23 21:18:53 -07:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							dcff10abe9 
							
						 
					 
					
						
						
							
							Add regression test for  #1281  
						
						
						
					 
					
						2017-08-21 16:11:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							796b2f4c1b 
							
						 
					 
					
						
						
							
							Remove print statements in tests  
						
						
						
					 
					
						2017-07-22 15:42:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4b2e5e59ed 
							
						 
					 
					
						
						
							
							Add flush_cache method to tokenizer, to  fix   #1061  
						
						
						
						
						The tokenizer caches output for common chunks, for efficiency. This
cache must be invalidated when the tokenizer rules change, e.g. when a new
special-case rule is introduced. That's what was causing #1061 .
When the cache is flushed, we free the intermediate token chunks.
I *think* this is safe --- but if we start getting segfaults, this patch
is to blame. The resolution would be to simply not free those bits of
memory. They'll be freed when the tokenizer exits anyway. 
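
The caching behaviour described in this commit message can be illustrated with a toy tokenizer. This is only a sketch under stated assumptions: `CachingTokenizer`, `add_special_case`, and the whitespace-based splitting are hypothetical and far simpler than spaCy's Cython tokenizer; only the name `flush_cache` comes from the commit message.

```python
class CachingTokenizer:
    """Toy whitespace tokenizer that caches the output for each chunk."""

    def __init__(self):
        self.special_cases = {}  # chunk -> list of token strings
        self._cache = {}         # chunk -> previously computed tokens

    def _split_chunk(self, chunk):
        # Apply a special-case rule if one exists, else keep the chunk whole.
        if chunk in self.special_cases:
            return list(self.special_cases[chunk])
        return [chunk]

    def __call__(self, text):
        tokens = []
        for chunk in text.split():
            if chunk not in self._cache:
                self._cache[chunk] = self._split_chunk(chunk)
            tokens.extend(self._cache[chunk])
        return tokens

    def flush_cache(self):
        # Drop every cached chunk so that new rules take effect.
        self._cache.clear()

    def add_special_case(self, chunk, pieces):
        self.special_cases[chunk] = list(pieces)
        # Without this flush, a chunk that was cached before the rule
        # existed would keep returning its stale tokenization.
        self.flush_cache()


tokenizer = CachingTokenizer()
before = tokenizer("lemme go")                 # "lemme" is now cached
tokenizer.add_special_case("lemme", ["lem", "me"])
after = tokenizer("lemme go")                  # rule applies after the flush
print(before, after)
```

Commenting out the `flush_cache()` call in `add_special_case` reproduces the stale-cache symptom in this toy: the second call would still return the chunk unsplit.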
						
					 
					
						2017-07-22 15:06:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d9b85675d7 
							
						 
					 
					
						
						
							
							Rename regression test  
						
						
						
					 
					
						2017-07-22 14:14:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dfbc7e49de 
							
						 
					 
					
						
						
							
							Add test for Issue  #1207  
						
						
						
					 
					
						2017-07-22 14:14:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0ae3807d7d 
							
						 
					 
					
						
						
							
							Fix gaps in Lexeme API.  Closes   #1031  
						
						
						
					 
					
						2017-07-22 13:53:48 +02:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							bc87b815cc 
							
						 
					 
					
						
						
							
							Add comment clarifying what LANGUAGES does  
						
						
						
					 
					
						2017-07-09 16:28:55 +09:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							04e6a65188 
							
						 
					 
					
						
						
							
							Remove Japanese from LANGUAGES  
						
						
						
						
						LANGUAGES is a list of languages whose tokenizers get run through a
variety of generic tests. Since the generic tests don't check the JA
fixture, it blows up when it can't find janome. -POLM 
						
					 
					
						2017-07-09 16:23:26 +09:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							c336193392 
							
						 
					 
					
						
						
							
							Parametrize and extend Japanese tokenizer tests  
						
						
						
					 
					
						2017-06-29 00:09:40 +09:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							30a34ebb6e 
							
						 
					 
					
						
						
							
							Add importorskip for janome  
						
						
						
					 
					
						2017-06-29 00:09:20 +09:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							e56fea14eb 
							
						 
					 
					
						
						
							
							Add basic Japanese tokenizer test  
						
						
						
					 
					
						2017-06-28 01:24:25 +09:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							6e1dbc608e 
							
						 
					 
					
						
						
							
							Fix parse_tree test  
						
						
						
					 
					
						2017-05-13 12:34:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ad590feaa8 
							
						 
					 
					
						
						
							
							Fix test, which imported English incorrectly  
						
						
						
					 
					
						2017-05-13 11:36:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b2540d2379 
							
						 
					 
					
						
						
							
							Merge Kengz's tree_print patch  
						
						
						
					 
					
						2017-05-13 03:18:49 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							7da9cefd25 
							
						 
					 
					
						
						
							
							Merge pull request  #1022  from luvogels/master  
						
						
						
						
						Initial support for Norwegian Bokmål 
						
					 
					
						2017-04-27 11:16:06 +02:00 
						 
				 
			
				
					
						
							
							
								luvogels 
							
						 
					 
					
						
						
						
						
							
						
						
							d12a0b6431 
							
						 
					 
					
						
						
							
							Hooked up tokenizer tests  
						
						
						
					 
					
						2017-04-26 23:21:41 +02:00 
						 
				 
			
				
					
						
							
							
								luvogels 
							
						 
					 
					
						
						
						
						
							
						
						
							8de59ce3b9 
							
						 
					 
					
						
						
							
							Added tokenizer tests  
						
						
						
					 
					
						2017-04-26 19:10:18 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4d98511db7 
							
						 
					 
					
						
						
							
							Make Span hashable.  Closes   #1019  
						
						
						
					 
					
						2017-04-26 19:01:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							24c4c51f13 
							
						 
					 
					
						
						
							
							Try to make test999 less flakey  
						
						
						
					 
					
						2017-04-26 18:42:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c4be9c36fe 
							
						 
					 
					
						
						
							
							Fix unicode header in tests  
						
						
						
					 
					
						2017-04-24 10:09:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							65f10b53e5 
							
						 
					 
					
						
						
							
							Fix test  
						
						
						
					 
					
						2017-04-24 00:25:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							70a43858e1 
							
						 
					 
					
						
						
							
							Fix flakey test  
						
						
						
					 
					
						2017-04-24 00:06:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3973af2d15 
							
						 
					 
					
						
						
							
							Make training test less flakey  
						
						
						
					 
					
						2017-04-23 22:59:34 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							42305bc519 
							
						 
					 
					
						
						
							
							Remove unnecessary test  
						
						
						
					 
					
						2017-04-23 21:21:41 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							012ea594d1 
							
						 
					 
					
						
						
							
							Add file for misc tests  
						
						
						
					 
					
						2017-04-23 21:06:51 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							83f66947dc 
							
						 
					 
					
						
						
							
							Rename test_download to test_cli  
						
						
						
					 
					
						2017-04-23 21:06:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							874a3cbb07 
							
						 
					 
					
						
						
							
							Add test for Issue  #955  
						
						
						
					 
					
						2017-04-23 17:57:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5d8af40445 
							
						 
					 
					
						
						
							
							Add test for Issue  #999  
						
						
						
					 
					
						2017-04-23 17:06:30 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							040751ad17 
							
						 
					 
					
						
						
							
							Remove xfail on Test  #910  
						
						
						
					 
					
						2017-04-23 16:28:55 +02:00 
						 
				 
			
				
					
						
							
							
								Ben Eyal 
							
						 
					 
					
						
						
						
						
							
						
						
							e90e8a3f10 
							
						 
					 
					
						
						
							
							Enable test  
						
						
						
					 
					
						2017-04-20 02:25:24 +03:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							2bd89e7ade 
							
						 
					 
					
						
						
							
							Tidy up Hebrew tests and test for punctuation (see  #995 )  
						
						
						
					 
					
						2017-04-19 19:28:03 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							13d30b6c01 
							
						 
					 
					
						
						
							
							xfail lemmatizer test that's causing problems (see  #546 )  
						
						
						
					 
					
						2017-04-16 21:18:39 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							0084466a66 
							
						 
					 
					
						
						
							
							Remove unused utf8open util and replace os.path with ensure_path  
						
						
						
					 
					
						2017-04-16 20:37:45 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1dca7eeb03 
							
						 
					 
					
						
						
							
							Add unicode declaration on new regression test  
						
						
						
					 
					
						2017-04-07 18:09:23 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							887827fc6a 
							
						 
					 
					
						
						
							
							Merge branch 'develop'  
						
						
						
					 
					
						2017-04-07 17:36:23 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							444dd511c5 
							
						 
					 
					
						
						
							
							Fix xpassing URL test case  
						
						
						
					 
					
						2017-04-07 17:36:05 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							bf0f15e762 
							
						 
					 
					
						
						
							
							Add / to tokenizer infixes ( resolves   #891 )  
						
						
						
					 
					
						2017-04-07 17:30:44 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							00b9011a49 
							
						 
					 
					
						
						
							
							Fix whitespace  
						
						
						
					 
					
						2017-04-07 17:29:59 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0513c43bf0 
							
						 
					 
					
						
						
							
							Merge branch 'master' of  https://github.com/explosion/spaCy  
						
						
						
					 
					
						2017-04-07 17:07:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cc36c308f4 
							
						 
					 
					
						
						
							
							Fix noun_chunk rules around coordination  
						
						
						
						
						Closes  #693 . 
					
						2017-04-07 17:06:40 +02:00