Merge pull request #7492 from adrianeboyd/bugfix/ux-matcher-attributes

Update matcher errors and docs
This commit is contained in:
Ines Montani 2021-03-21 02:17:13 +01:00 committed by GitHub
commit e3c3dbdb15
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 11 additions and 8 deletions

View File

@ -202,6 +202,8 @@ cdef class Matcher:
doclike (Doc or Span): The document to match over.
as_spans (bool): Return Span objects with labels instead of (match_id,
start, end) tuples.
allow_missing (bool): Whether to skip checks for missing annotation for
attributes included in patterns. Defaults to False.
RETURNS (list): A list of `(match_id, start, end)` tuples,
describing the matches. A match tuple describes a span
`doc[start:end]`. The `match_id` is an integer. If as_spans is set
@ -222,7 +224,7 @@ cdef class Matcher:
if attr == TAG:
pipe = "tagger"
elif attr in (POS, MORPH):
pipe = "morphologizer"
pipe = "morphologizer or tagger+attribute_ruler"
elif attr == LEMMA:
pipe = "lemmatizer"
elif attr == DEP:

View File

@ -194,7 +194,7 @@ cdef class PhraseMatcher:
if attr == TAG:
pipe = "tagger"
elif attr in (POS, MORPH):
pipe = "morphologizer"
pipe = "morphologizer or tagger+attribute_ruler"
elif attr == LEMMA:
pipe = "lemmatizer"
elif attr == DEP:

View File

@ -120,12 +120,13 @@ Find all token sequences matching the supplied patterns on the `Doc` or `Span`.
> matches = matcher(doc)
> ```
| Name | Description |
| ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `doclike` | The `Doc` or `Span` to match over. ~~Union[Doc, Span]~~ |
| _keyword-only_ | |
| `as_spans` <Tag variant="new">3</Tag> | Instead of tuples, return a list of [`Span`](/api/span) objects of the matches, with the `match_id` assigned as the span label. Defaults to `False`. ~~bool~~ |
| **RETURNS** | A list of `(match_id, start, end)` tuples, describing the matches. A match tuple describes a span `doc[start:end`]. The `match_id` is the ID of the added match pattern. If `as_spans` is set to `True`, a list of `Span` objects is returned instead. ~~Union[List[Tuple[int, int, int]], List[Span]]~~ |
| Name | Description |
| ------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `doclike` | The `Doc` or `Span` to match over. ~~Union[Doc, Span]~~ |
| _keyword-only_ | |
| `as_spans` <Tag variant="new">3</Tag> | Instead of tuples, return a list of [`Span`](/api/span) objects of the matches, with the `match_id` assigned as the span label. Defaults to `False`. ~~bool~~ |
| `allow_missing` <Tag variant="new">3</Tag> | Whether to skip checks for missing annotation for attributes included in patterns. Defaults to `False`. ~~bool~~ |
| **RETURNS** | A list of `(match_id, start, end)` tuples, describing the matches. A match tuple describes a span `doc[start:end`]. The `match_id` is the ID of the added match pattern. If `as_spans` is set to `True`, a list of `Span` objects is returned instead. ~~Union[List[Tuple[int, int, int]], List[Span]]~~ |
## Matcher.\_\_len\_\_ {#len tag="method" new="2"}