Ioannis Daras 
							
						 
					 
					
						
						
						
						
							
						
						
							6ed18412d0 
							
						 
					 
					
						
						
							
							Greek language optimizations ( #2558 )  
						
						 
						
						
						
						
						* Greek language optimizations
* Add encoding on files containing greek words
* Add encoding on files containing greek words 
						
					 
					
						2018-07-18 18:51:38 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Aliia E 
							
						 
					 
					
						
						
						
						
							
						
						
							428bae66b5 
							
						 
					 
					
						
						
							
							Add Tatar Language Support ( #2444 )  
						
						 
						
						
						
						
						* add Tatar lang support
* add Tatar letters
* add Tatar tests
* sign contributor agreement
* sign contributor agreement [x]
* remove comments from Language class
* remove all template comments 
						
					 
					
						2018-06-19 10:17:53 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Tahar Zanouda 
							
						 
					 
					
						
						
						
						
							
						
						
							00417794d3 
							
						 
					 
					
						
						
							
							Add Arabic language ( #2314 )  
						
						 
						
						
						
						
						* added support for Arabic lang
* added Arabic language support
* updated conftest 
						
					 
					
						2018-05-15 00:27:19 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Ali Zarezade 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							42349471bc 
							
						 
					 
					
						
						
							
							add ٪ as punctuation  
						
						 
						
						
						
					 
					
						2018-01-23 18:11:33 +03:30  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Ali Zarezade 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2bda582135 
							
						 
					 
					
						
						
							
							Add Persian character and symbols  
						
						 
						
						
						
						
						Add Persian characters and the following:
- ٪ used instead of %
- ؟ used instead of ?
- ﷼ used instead of $
- ، used instead of ,
- ؛ used instead of ; 
						
					 
					
						2018-01-23 13:20:36 +03:30  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Vadim Mazaev 
							
						 
					 
					
						
						
						
						
							
						
						
							81314f8659 
							
						 
					 
					
						
						
							
							Fixed tokenizer: added char classes; added first lemmatizer and tokenizer tests
						
					 
					
						2017-11-21 22:23:59 +03:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							e85e1d571b 
							
						 
					 
					
						
						
							
							Update base punctuation  
						
						 
						
						
						
					 
					
						2017-10-14 14:59:23 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							09aed58140 
							
						 
					 
					
						
						
							
							Port over changes from  #1333  and add comments  
						
						 
						
						
						
					 
					
						2017-10-14 12:52:59 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							5ee10379db 
							
						 
					 
					
						
						
							
							Port over changes from  #1340  
						
						 
						
						
						
					 
					
						2017-09-26 16:38:08 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							10d291f129 
							
						 
					 
					
						
						
							
							Port over change from  #1351  
						
						 
						
						
						
					 
					
						2017-09-26 16:11:41 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cfc055734e 
							
						 
					 
					
						
						
							
							Split % in units, for compatibility with corpus  
						
						 
						
						
						
					 
					
						2017-08-25 20:03:37 -05:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							a8e58e04ef 
							
						 
					 
					
						
						
							
							Add symbols class to punctuation rules to handle emoji (see  #1088 )  
						
						 
						
						
						
						
						Currently doesn't work for Hungarian, because of conflicts with the
custom punctuation rules. Also doesn't take multi-character emoji like
👩🏽💻 into account.
						
					 
					
						2017-05-27 17:57:10 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							604f299cf6 
							
						 
					 
					
						
						
							
							Add char classes to global language data  
						
						 
						
						
						
					 
					
						2017-05-08 23:59:33 +02:00