mirror of https://github.com/explosion/spaCy.git
synced 2024-12-25 01:16:28 +03:00
Update Tokenizer documentation to reflect token_match and url_match signatures (#9859)
parent ba0fa7a64e
commit ac45ae3779
@@ -45,10 +45,12 @@ cdef class Tokenizer:
             `re.compile(string).search` to match suffixes.
         `infix_finditer` (callable): A function matching the signature of
             `re.compile(string).finditer` to find infixes.
-        token_match (callable): A boolean function matching strings to be
+        token_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
             recognized as tokens.
-        url_match (callable): A boolean function matching strings to be
-            recognized as tokens after considering prefixes and suffixes.
+        url_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
+            recognized as urls.
 
         EXAMPLE:
             >>> tokenizer = Tokenizer(nlp.vocab)
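The updated docstring describes `token_match` and `url_match` as callables with the signature of `re.compile(string).match`, that is, they take a string and return a match object or `None` rather than a plain boolean. A minimal sketch of that contract, using only the standard library, might look like the following. Both regular expressions are hypothetical and chosen purely for illustration; they are not spaCy's defaults.

```python
import re

# Hypothetical patterns, for illustration only (not spaCy's built-in ones).
# Each callable has the signature of `re.compile(string).match`:
# it takes a string and returns a match object, or None if there is no match.
token_match = re.compile(r"\d+(?:\.\d+)+").match  # dotted numbers, e.g. "3.2.1"
url_match = re.compile(r"https?://\S+").match     # a very rough URL pattern

for text in ["3.2.1", "https://spacy.io", "hello"]:
    # The result is truthy or falsy, but it is not a bool.
    print(text, token_match(text), url_match(text))
```

Assuming spaCy is installed and `nlp` is a loaded pipeline, callables like these could then be passed to the constructor as `Tokenizer(nlp.vocab, token_match=token_match, url_match=url_match)`. Note that `match` anchors only at the start of the string, so a pattern that should cover the whole string needs an explicit end anchor.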