mirror of https://github.com/explosion/spaCy.git, synced 2025-11-04 09:57:26 +03:00
	Update Tokenizer documentation to reflect token_match and url_match signatures (#9859)
This commit is contained in:

parent ba0fa7a64e
commit ac45ae3779
```diff
@@ -45,10 +45,12 @@ cdef class Tokenizer:
             `re.compile(string).search` to match suffixes.
         `infix_finditer` (callable): A function matching the signature of
             `re.compile(string).finditer` to find infixes.
-        token_match (callable): A boolean function matching strings to be
-            recognized as tokens.
-        url_match (callable): A boolean function matching strings to be
-            recognized as tokens after considering prefixes and suffixes.
+        token_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
+            recognized as tokens.
+        url_match (callable): A function matching the signature of
+            `re.compile(string).match`, for matching strings to be
+            recognized as urls.
 
         EXAMPLE:
             >>> tokenizer = Tokenizer(nlp.vocab)
```
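The docstring change above clarifies the contract: `token_match` and `url_match` are not arbitrary boolean predicates but callables with the signature of `re.compile(string).match`, i.e. they take a string and return an `re.Match` object (truthy) or `None`. A minimal sketch of that contract, using only the standard `re` module (the example patterns are illustrative assumptions, not spaCy's defaults):

```python
import re

# token_match / url_match must behave like re.compile(string).match:
# string in, re.Match or None out. The bound .match method of a
# compiled pattern satisfies this directly.
token_match = re.compile(r"\d+-\d+$").match      # e.g. keep "3-2" as one token
url_match = re.compile(r"https?://\S+$").match   # e.g. keep URLs as one token

# Truthy Match object -> the string is recognized as a single token;
# None -> the tokenizer falls through to its other rules.
print(token_match("3-2"))              # a Match object
print(token_match("hello"))            # None
print(url_match("https://spacy.io"))   # a Match object
```

Such callables can then be passed as the `token_match` and `url_match` arguments when constructing a `Tokenizer`, which is exactly why the docstring now points at `re.compile(string).match` rather than describing them as "boolean functions".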