mirror of https://github.com/explosion/spaCy.git, synced 2025-11-04 09:57:26 +03:00
	Update Tokenizer documentation to reflect token_match and url_match signatures (#9859)
This commit is contained in:

parent ba0fa7a64e
commit ac45ae3779
```diff
@@ -45,10 +45,12 @@ cdef class Tokenizer:
             `re.compile(string).search` to match suffixes.
         `infix_finditer` (callable): A function matching the signature of
             `re.compile(string).finditer` to find infixes.
-        token_match (callable): A boolean function matching strings to be
-            recognized as tokens.
-        url_match (callable): A boolean function matching strings to be
-            recognized as tokens after considering prefixes and suffixes.
+        token_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
+            recognized as tokens.
+        url_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
+            recognized as urls.
 
         EXAMPLE:
             >>> tokenizer = Tokenizer(nlp.vocab)
```
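The docstring change above clarifies the contract: `token_match` and `url_match` are not arbitrary boolean predicates but callables with the signature of `re.compile(string).match`, i.e. they take a string and return an `re.Match` object (truthy) or `None`. A minimal sketch of that contract, using only the standard `re` module (the example patterns are illustrative assumptions, not spaCy's defaults):

```python
import re

# token_match / url_match must behave like re.compile(string).match:
# string in, re.Match or None out. The bound .match method of a
# compiled pattern satisfies this directly.
token_match = re.compile(r"\d+-\d+$").match      # e.g. keep "3-2" as one token
url_match = re.compile(r"https?://\S+$").match   # e.g. keep URLs as one token

# Truthy Match object -> the string is recognized as a single token;
# None -> the tokenizer falls through to its other rules.
print(token_match("3-2"))              # a Match object
print(token_match("hello"))            # None
print(url_match("https://spacy.io"))   # a Match object
```

Such callables can then be passed as the `token_match` and `url_match` arguments when constructing a `Tokenizer`, which is exactly why the docstring now points at `re.compile(string).match` rather than describing them as "boolean functions".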