Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							59c216196c 
							
						 
					 
					
						
						
							
							Allow weakrefs on Doc objects  
						
						
						
					 
					
						2017-10-16 19:22:11 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							d5418553eb 
							
						 
					 
					
						
						
							
							Fix whitespace  
						
						
						
					 
					
						2017-10-16 18:30:04 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							6ceadcdb5c 
							
						 
					 
					
						
						
							
							Make sure from_disk passes string to numpy (see  #1421 )  
						
						
						
						
						If path is a WindowsPath, numpy does not recognise it as a path and as
a result, doesn't open the file.
https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L369  
						
					 
					
						2017-10-16 18:29:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							010a7309ff 
							
						 
					 
					
						
						
							
							Merge pull request  #1402  from explosion/feature/fix-matcher-operators  
						
						
						
						
						💫  Fix Matcher variable-length operators 
					
						2017-10-16 17:53:19 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c29927d2e7 
							
						 
					 
					
						
						
							
							Fix matcher test  
						
						
						
					 
					
						2017-10-16 17:22:18 +02:00 
						 
				 
			
				
					
						
							
							
								Vishnu Kumar Nekkanti 
							
						 
					 
					
						
						
						
						
							
						
						
							d3c54cf39a 
							
						 
					 
					
						
						
							
							fixed SyntaxError while checking for jieba  
						
						
						
					 
					
						2017-10-16 18:51:33 +05:30 
						 
				 
			
				
					
						
							
							
								Vishnu Kumar Nekkanti 
							
						 
					 
					
						
						
						
						
							
						
						
							18ec6610dd 
							
						 
					 
					
						
						
							
							Merge pull request  #1  from explosion/develop  
						
						
						
						
						Develop 
						
					 
					
						2017-10-16 18:34:13 +05:30 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							63393b4e0d 
							
						 
					 
					
						
						
							
							Update matcher docs to reflect operator changes  
						
						
						
					 
					
						2017-10-16 13:44:12 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a928ae2f35 
							
						 
					 
					
						
						
							
							Merge branch 'develop' into feature/fix-matcher-operators  
						
						
						
					 
					
						2017-10-16 13:38:36 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							56aa42cc5d 
							
						 
					 
					
						
						
							
							Fix and document matcher operator 'shadowing' behaviour  
						
						
						
					 
					
						2017-10-16 13:38:20 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							748d525801 
							
						 
					 
					
						
						
							
							Add more matcher operator tests  
						
						
						
					 
					
						2017-10-16 13:38:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0433181658 
							
						 
					 
					
						
						
							
							Document operator semantics in Matcher docstring  
						
						
						
					 
					
						2017-10-16 12:06:33 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cd9378c8f1 
							
						 
					 
					
						
						
							
							Merge pull request  #1423  from yuukos/master  
						
						
						
						
						Fixed Russian tokenizer 
						
					 
					
						2017-10-16 11:45:53 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6b0121091c 
							
						 
					 
					
						
						
							
							Merge pull request  #1420  from polm/master  
						
						
						
						
						[ja] Stash tokenizer output for speed 
						
					 
					
						2017-10-16 10:28:22 +02:00 
						 
				 
			
				
					
						
							
							
								yuukos 
							
						 
					 
					
						
						
						
						
							
						
						
							34e9c6ddc0 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'origin/master'  
						
						
						
					 
					
						2017-10-16 13:48:10 +07:00 
						 
				 
			
				
					
						
							
							
								yuukos 
							
						 
					 
					
						
						
						
						
							
						
						
							92931a2efd 
							
						 
					 
					
						
						
							
							Merge branch 'russian_language'  
						
						
						
					 
					
						2017-10-16 13:46:28 +07:00 
						 
				 
			
				
					
						
							
							
								yuukos 
							
						 
					 
					
						
						
						
						
							
						
						
							241d19a3e6 
							
						 
					 
					
						
						
							
							fixed Russian Tokenizer  
						
						
						
						
						- added trailing space flags for tokens 
						
					 
					
						2017-10-16 13:37:05 +07:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							71ae8013ec 
							
						 
					 
					
						
						
							
							[ja] Use user_details instead of a wrapper class  
						
						
						
						
						Instead of using a JapaneseDoc wrapper class to store Mecab output,
stash it in `user_data`. -POLM 
						
					 
					
						2017-10-16 00:24:34 +09:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							43eedf73f2 
							
						 
					 
					
						
						
							
							[ja] Stash tokenizer output for speed  
						
						
						
						
						Before this commit, the Mecab tokenizer had to be called twice when
creating a Doc- once during tokenization and once during tagging. This
creates a JapaneseDoc wrapper class for Doc that stashes the parsed
tokenizer output to remove redundant processing. -POLM 
						
					 
					
						2017-10-15 23:33:25 +09:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							15514dc333 
							
						 
					 
					
						
						
							
							Add section on upgrading  
						
						
						
					 
					
						2017-10-14 22:14:47 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							c0aceb9fbe 
							
						 
					 
					
						
						
							
							Add Hindi to supported languages  
						
						
						
					 
					
						2017-10-14 15:16:41 +02:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							e00a6c08cf 
							
						 
					 
					
						
						
							
							Merge pull request  #1418  from polm/master  
						
						
						
						
						Contributor agreement 
						
					 
					
						2017-10-14 15:10:58 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							266e7180a7 
							
						 
					 
					
						
						
							
							Add Language class, stop words and basic stemmer that sets NORM  
						
						
						
					 
					
						2017-10-14 14:59:52 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							e85e1d571b 
							
						 
					 
					
						
						
							
							Update base punctuation  
						
						
						
					 
					
						2017-10-14 14:59:23 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							9d6c8eaa49 
							
						 
					 
					
						
						
							
							Update base norm exceptions with more unicode characters  
						
						
						
						
						e.g. unicode variations of punctuation used in Chinese 
						
					 
					
						2017-10-14 14:58:52 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							3516aa0cea 
							
						 
					 
					
						
						
							
							Port over changes from  #1389  
						
						
						
					 
					
						2017-10-14 13:32:55 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							cd6a29dce7 
							
						 
					 
					
						
						
							
							Port over changes from  #1294  
						
						
						
					 
					
						2017-10-14 13:28:46 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							38c756fd85 
							
						 
					 
					
						
						
							
							Port over changes from  #1287  
						
						
						
					 
					
						2017-10-14 13:16:21 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							612224c10d 
							
						 
					 
					
						
						
							
							Port over changes from  #1157  
						
						
						
					 
					
						2017-10-14 13:11:39 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							9b3f8f9ec3 
							
						 
					 
					
						
						
							
							Fix formatting and add comment on languages  
						
						
						
					 
					
						2017-10-14 13:11:18 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							a4d974d97b 
							
						 
					 
					
						
						
							
							Port over URL pattern changes from  #1411  
						
						
						
					 
					
						2017-10-14 12:58:07 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							09aed58140 
							
						 
					 
					
						
						
							
							Port over changes from  #1333  and add comments  
						
						
						
					 
					
						2017-10-14 12:52:59 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							a5da683578 
							
						 
					 
					
						
						
							
							Add Russian to alpha docs and update tokenizer dependencies  
						
						
						
					 
					
						2017-10-14 12:52:41 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							a69f4e56e5 
							
						 
					 
					
						
						
							
							Remove outdated aside  
						
						
						
					 
					
						2017-10-14 12:52:07 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							bb6ecb82e5 
							
						 
					 
					
						
						
							
							Ensure long file paths in code examples break if needed  
						
						
						
					 
					
						2017-10-14 12:51:52 +02:00 
						 
				 
			
				
					
						
							
							
								Paul O'Leary McCann 
							
						 
					 
					
						
						
						
						
							
						
						
							a31d33be06 
							
						 
					 
					
						
						
							
							Contributor agreement  
						
						
						
					 
					
						2017-10-14 19:28:04 +09:00 
						 
				 
			
				
					
						
							
							
								Ines Montani 
							
						 
					 
					
						
						
						
						
							
						
						
							4b5af8bd17 
							
						 
					 
					
						
						
							
							Merge pull request  #1414  from yuukos/master  
						
						
						
						
						Adding Russian language support 
						
					 
					
						2017-10-13 17:03:52 +02:00 
						 
				 
			
				
					
						
							
							
								Alex 
							
						 
					 
					
						
						
						
						
							
						
						
							95836abee1 
							
						 
					 
					
						
						
							
							Update CONTRIBUTORS.md  
						
						
						
					 
					
						2017-10-13 21:02:19 +07:00 
						 
				 
			
				
					
						
							
							
								Alex 
							
						 
					 
					
						
						
						
						
							
						
						
							ce00405afc 
							
						 
					 
					
						
						
							
							Create yuukos.md  
						
						
						
					 
					
						2017-10-13 21:00:15 +07:00 
						 
				 
			
				
					
						
							
							
								yuukos 
							
						 
					 
					
						
						
						
						
							
						
						
							6fb9d75bd2 
							
						 
					 
					
						
						
							
							fixed test with creating tokenizer  
						
						
						
					 
					
						2017-10-13 15:51:03 +07:00 
						 
				 
			
				
					
						
							
							
								yuukos 
							
						 
					 
					
						
						
						
						
							
						
						
							a229b6e0de 
							
						 
					 
					
						
						
							
							added tests for Russian language  
						
						
						
						
						added tests of creating Russian Language instance and Russian tokenizer 
						
					 
					
						2017-10-13 14:04:37 +07:00 
						 
				 
			
				
					
						
							
							
								yuukos 
							
						 
					 
					
						
						
						
						
							
						
						
							622b6d6270 
							
						 
					 
					
						
						
							
							updated Russian tokenizer  
						
						
						
						
						moved the trying to import pymorph into __init__ 
						
					 
					
						2017-10-13 13:57:29 +07:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							bfd9506f1d 
							
						 
					 
					
						
						
							
							Update extensions docs and add resources  
						
						
						
					 
					
						2017-10-13 00:18:13 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							5f5d6897e8 
							
						 
					 
					
						
						
							
							Increment version  
						
						
						
					 
					
						2017-10-13 00:18:02 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							9fd68334ab 
							
						 
					 
					
						
						
							
							Add validate command docs  
						
						
						
					 
					
						2017-10-12 23:36:48 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cf6da9301a 
							
						 
					 
					
						
						
							
							Update lemmatizer test  
						
						
						
					 
					
						2017-10-12 22:50:52 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9b90d235d1 
							
						 
					 
					
						
						
							
							Fix tag check in lemmatizer  
						
						
						
					 
					
						2017-10-12 22:50:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							dc01acd821 
							
						 
					 
					
						
						
							
							Escape encoding in validate function  
						
						
						
					 
					
						2017-10-12 22:23:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							27b927259a 
							
						 
					 
					
						
						
							
							Add locale_escape compat function  
						
						
						
					 
					
						2017-10-12 22:22:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e72603f39f 
							
						 
					 
					
						
						
							
							Merge pull request  #1416  from explosion/feature/cli-validate  
						
						
						
						
						💫  Add "validate" command to CLI 
					
						2017-10-12 21:45:20 +02:00