mirror of https://github.com/explosion/spaCy.git (synced 2025-01-25 00:34:20 +03:00)
Remove LEMMA from exception examples [ci skip]
parent 82c16b7943
commit 25b2b3ff45
@@ -118,8 +118,8 @@ and examples.
 > #### Example
 >
 > ```python
-> from spacy.attrs import ORTH, LEMMA
-> case = [{ORTH: "do"}, {ORTH: "n't", LEMMA: "not"}]
+> from spacy.attrs import ORTH, NORM
+> case = [{ORTH: "do"}, {ORTH: "n't", NORM: "not"}]
 > tokenizer.add_special_case("don't", case)
 > ```
 
@@ -514,9 +514,9 @@ an error if key doesn't match `ORTH` values.
 >
 > ```python
 > BASE = {"a.": [{ORTH: "a."}], ":)": [{ORTH: ":)"}]}
-> NEW = {"a.": [{ORTH: "a.", LEMMA: "all"}]}
+> NEW = {"a.": [{ORTH: "a.", NORM: "all"}]}
 > exceptions = util.update_exc(BASE, NEW)
-> # {"a.": [{ORTH: "a.", LEMMA: "all"}], ":)": [{ORTH: ":)"}]}
+> # {"a.": [{ORTH: "a.", NORM: "all"}], ":)": [{ORTH: ":)"}]}
 > ```
 
 | Name | Type | Description |
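The merge and validation behaviour that the `update_exc` example in this hunk describes can be sketched in plain Python. This is a simplified stand-in, not spaCy's implementation: `update_exc_sketch` is a hypothetical name, and the string constants `ORTH` and `NORM` are assumptions that replace spaCy's attribute IDs so the snippet runs without spaCy installed.

```python
# Hypothetical stand-ins for spaCy's ORTH / NORM attribute IDs (assumption).
ORTH, NORM = "ORTH", "NORM"


def update_exc_sketch(base, *additions):
    """Merge exception dicts; later dicts override earlier ones per key."""
    merged = dict(base)
    for addition in additions:
        for key, tokens in addition.items():
            # The ORTH values must concatenate back to the exception key.
            joined = "".join(token[ORTH] for token in tokens)
            if joined != key:
                raise ValueError(
                    f"key {key!r} does not match ORTH values {joined!r}"
                )
        merged.update(addition)
    return merged


BASE = {"a.": [{ORTH: "a."}], ":)": [{ORTH: ":)"}]}
NEW = {"a.": [{ORTH: "a.", NORM: "all"}]}
exceptions = update_exc_sketch(BASE, NEW)
```

Under these assumptions, the entry for `"a."` is replaced by the version carrying `NORM`, the `":)"` entry is kept unchanged, and an addition whose `ORTH` values do not spell its key raises a `ValueError`.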
@@ -649,7 +649,7 @@ import Tokenization101 from 'usage/101/\_tokenization.md'
 data in
 [`spacy/lang`](https://github.com/explosion/spaCy/tree/master/spacy/lang). The
 tokenizer exceptions define special cases like "don't" in English, which needs
-to be split into two tokens: `{ORTH: "do"}` and `{ORTH: "n't", LEMMA: "not"}`.
+to be split into two tokens: `{ORTH: "do"}` and `{ORTH: "n't", NORM: "not"}`.
 The prefixes, suffixes and infixes mostly define punctuation rules – for
 example, when to split off periods (at the end of a sentence), and when to leave
 tokens containing periods intact (abbreviations like "U.S.").
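To illustrate what the changed paragraph means by splitting "don't" into two tokens, here is a minimal sketch of a special-case lookup in plain Python. It is not how spaCy's tokenizer is implemented: `tokenize_sketch` and `SPECIAL_CASES` are hypothetical names, the string constants stand in for spaCy's attribute IDs, prefix, suffix and infix rules are ignored, and the norm falls back to the token text as a simplification.

```python
# Hypothetical stand-ins for spaCy's ORTH / NORM attribute IDs (assumption).
ORTH, NORM = "ORTH", "NORM"

# One exception entry, mirroring the example in the documentation text.
SPECIAL_CASES = {"don't": [{ORTH: "do"}, {ORTH: "n't", NORM: "not"}]}


def tokenize_sketch(text):
    """Split on whitespace, then expand chunks listed in SPECIAL_CASES."""
    tokens = []
    for chunk in text.split():
        # Chunks without an exception become a single token.
        for attrs in SPECIAL_CASES.get(chunk, [{ORTH: chunk}]):
            # Simplification: the norm defaults to the token text itself.
            tokens.append((attrs[ORTH], attrs.get(NORM, attrs[ORTH])))
    return tokens


result = tokenize_sketch("I don't know")
for text, norm in result:
    print(f"{text!r:8} norm={norm!r}")
```

In this sketch the token text still spells out the original string when joined (`"do"` + `"n't"`), while `NORM` carries the normalized form `"not"`, which is the role the commit assigns to `NORM` in place of `LEMMA`.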