Small doc typos (#10750)

* fix typos

* formatting
Sofie Van Landeghem 2022-05-03 13:55:27 +02:00 committed by GitHub
parent f5390e278a
commit e03b9f8095

@ -949,7 +949,7 @@ for match_id, start, end in matcher(doc):
 The examples here use [`nlp.make_doc`](/api/language#make_doc) to create `Doc`
 object patterns as efficiently as possible and without running any of the other
-pipeline components. If the token attribute you want to match on are set by a
+pipeline components. If the token attribute you want to match on is set by a
 pipeline component, **make sure that the pipeline component runs** when you
 create the pattern. For example, to match on `POS` or `LEMMA`, the pattern `Doc`
 objects need to have part-of-speech tags set by the `tagger` or `morphologizer`.
@ -960,9 +960,9 @@ disable components selectively.
 </Infobox>
 Another possible use case is matching number tokens like IP addresses based on
-their shape. This means that you won't have to worry about how those string will
-be tokenized and you'll be able to find tokens and combinations of tokens based
-on a few examples. Here, we're matching on the shapes `ddd.d.d.d` and
+their shape. This means that you won't have to worry about how those strings
+will be tokenized and you'll be able to find tokens and combinations of tokens
+based on a few examples. Here, we're matching on the shapes `ddd.d.d.d` and
 `ddd.ddd.d.d`:
```python ```python
@ -1433,7 +1433,7 @@ of `"phrase_matcher_attr": "POS"` for the entity ruler.
 Running the full language pipeline across every pattern in a large list scales
 linearly and can therefore take a long time on large amounts of phrase patterns.
 As of spaCy v2.2.4 the `add_patterns` function has been refactored to use
-nlp.pipe on all phrase patterns resulting in about a 10x-20x speed up with
+`nlp.pipe` on all phrase patterns resulting in about a 10x-20x speed up with
 5,000-100,000 phrase patterns respectively. Even with this speedup (but
 especially if you're using an older version) the `add_patterns` function can
 still take a long time. An easy workaround to make this function run faster is