Adriane Boyd
e908a67829
Handle unknown tags in KoreanTokenizer tag map (#10536)
2022-03-24 11:25:36 +01:00
						 
				 
			
				
					
						
							
							
Adriane Boyd
30030176ee
Update Korean defaults for Tokenizer (#10322)
Update Korean defaults for `Tokenizer` for tokenization following UD Korean Kaist.
2022-02-21 10:26:19 +01:00
						 
				 
			
				
					
						
							
							
Adriane Boyd
c5de9b463a
Update custom tokenizer APIs and pickling (#8972)
* Fix incorrect pickling of Japanese and Korean pipelines, which led to the entire pipeline being reset if pickled
* Enable pickling of Vietnamese tokenizer
* Update tokenizer APIs for Chinese, Japanese, Korean, Thai, and Vietnamese so that only the `Vocab` is required for initialization
2021-08-19 14:37:47 +02:00
						 
				 
			
				
					
						
							
							
Ines Montani
db55577c45
Drop Python 2.7 and 3.5 (#4828)
* Remove unicode declarations
* Remove Python 3.5 and 2.7 from CI
* Don't require pathlib
* Replace compat helpers
* Remove OrderedDict
* Use f-strings
* Set Cython compiler language level
* Fix typo
* Re-add OrderedDict for Table
* Update setup.cfg
* Revert CONTRIBUTING.md
* Revert lookups.md
* Revert top-level.md
* Small adjustments and docs [ci skip]
2019-12-22 01:53:56 +01:00
						 
				 
			
				
					
						
							
							
Ines Montani
3d8fd4b461
Revert #4334
2019-09-29 17:32:12 +02:00
						 
				 
			
				
					
						
							
							
Ines Montani
c9cd516d96
Move tests out of package (#4334)
* Move tests out of package
* Fix typo
2019-09-28 18:05:00 +02:00
						 
				 
			
				
					
						
							
							
Bae Yong-Ju
a55f5a744f
Fix ValueError exception on empty Korean text. (#4245)
2019-09-06 10:29:40 +02:00
						 
				 
			
				
					
						
							
							
Bae Yong-Ju
05fbf5d976
Fix error when Korean text contains regexp special characters. (#4022)
2019-07-25 17:53:33 +02:00
						 
				 
			
				
					
						
							
							
Ines Montani
0b8406a05c
Tidy up and auto-format
2019-07-11 12:02:25 +02:00
						 
				 
			
				
					
						
							
							
cedar101
58f06e6180
Korean support (#3901)
* start lang/ko
* add test codes
* using natto-py
* add test_ko_tokenizer_full_tags()
* spaCy contributor agreement
* external dependency for ko
* collections.namedtuple for python version < 3.5
* case fix
* tuple unpacking
* add jongseong(final consonant)
* apply mecab option
* Remove Pipfile for now
Co-authored-by: Ines Montani <ines@ines.io>
2019-07-09 22:23:16 +02:00