Mirror of https://github.com/explosion/spaCy.git, synced 2025-11-04 01:48:04 +03:00
			
		
		
		
* Adapt tokenization methods from `pyvi` to preserve text encoding and whitespace
* Add serialization support similar to Chinese and Japanese

Note: as for Chinese and Japanese, some settings are duplicated in `config.cfg` and `tokenizer/cfg`.
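As a rough illustration of what "preserve whitespace" can mean for a wrapped word segmenter, the sketch below maps segmenter output back onto the original string so that the text can be rebuilt exactly. It is not the code from this commit and not `pyvi`'s or spaCy's API: the helper names `align_words_and_spaces` and `rebuild` are hypothetical, and the example assumes the segmenter returns words in order, each appearing verbatim in the input.

```python
from typing import List, Tuple


def align_words_and_spaces(text: str, words: List[str]) -> Tuple[List[str], List[bool]]:
    """Map segmenter output back onto `text` so that no character is lost.

    Hypothetical helper for illustration only. Returns (tokens, spaces),
    where spaces[i] is True when tokens[i] is followed by a single space
    in the original text. Any extra whitespace becomes its own token.
    """
    tokens: List[str] = []
    spaces: List[bool] = []
    pos = 0
    for word in words:
        start = text.find(word, pos)
        if start < 0:
            raise ValueError(f"{word!r} not found in text after offset {pos}")
        if start > pos:
            # Text skipped by the segmenter (e.g. a second space) is kept
            # as a separate token instead of being dropped.
            tokens.append(text[pos:start])
            spaces.append(False)
        tokens.append(word)
        end = start + len(word)
        has_space = text[end:end + 1] == " "
        spaces.append(has_space)
        pos = end + (1 if has_space else 0)
    if pos < len(text):
        # Trailing text after the last word is also preserved.
        tokens.append(text[pos:])
        spaces.append(False)
    return tokens, spaces


def rebuild(tokens: List[str], spaces: List[bool]) -> str:
    """Reassemble the original text from tokens and trailing-space flags."""
    return "".join(tok + (" " if sp else "") for tok, sp in zip(tokens, spaces))


text = "Tôi  là sinh viên."  # note the double space after "Tôi"
tokens, spaces = align_words_and_spaces(text, ["Tôi", "là", "sinh viên", "."])
assert rebuild(tokens, spaces) == text
```

The design point is that the round trip is lossless: multi-syllable words such as "sinh viên" stay as one token with the inner space intact, and irregular whitespace survives as its own token rather than being normalized away.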
		
			
				
	
	
		
			0 lines
		
	
	
	
	
		
			Python
		
	
	
	
	
	
			
		
		
	
	