mirror of
https://github.com/explosion/spaCy.git
synced 2025-01-27 17:54:39 +03:00
Remove LEMMA from exception examples [ci skip]
parent
82c16b7943
commit
25b2b3ff45
@@ -118,8 +118,8 @@ and examples.
 > #### Example
 >
 > ```python
-> from spacy.attrs import ORTH, LEMMA
-> case = [{ORTH: "do"}, {ORTH: "n't", LEMMA: "not"}]
+> from spacy.attrs import ORTH, NORM
+> case = [{ORTH: "do"}, {ORTH: "n't", NORM: "not"}]
 > tokenizer.add_special_case("don't", case)
 > ```
 
@@ -514,9 +514,9 @@ an error if key doesn't match `ORTH` values.
 >
 > ```python
 > BASE = {"a.": [{ORTH: "a."}], ":)": [{ORTH: ":)"}]}
-> NEW = {"a.": [{ORTH: "a.", LEMMA: "all"}]}
+> NEW = {"a.": [{ORTH: "a.", NORM: "all"}]}
 > exceptions = util.update_exc(BASE, NEW)
-> # {"a.": [{ORTH: "a.", LEMMA: "all"}], ":)": [{ORTH: ":)"}]}
+> # {"a.": [{ORTH: "a.", NORM: "all"}], ":)": [{ORTH: ":)"}]}
 > ```
 
 | Name | Type | Description |
@@ -649,7 +649,7 @@ import Tokenization101 from 'usage/101/\_tokenization.md'
 data in
 [`spacy/lang`](https://github.com/explosion/spaCy/tree/master/spacy/lang). The
 tokenizer exceptions define special cases like "don't" in English, which needs
-to be split into two tokens: `{ORTH: "do"}` and `{ORTH: "n't", LEMMA: "not"}`.
+to be split into two tokens: `{ORTH: "do"}` and `{ORTH: "n't", NORM: "not"}`.
 The prefixes, suffixes and infixes mostly define punctuation rules – for
 example, when to split off periods (at the end of a sentence), and when to leave
 tokens containing periods intact (abbreviations like "U.S.").
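
Note on the changed examples: the second hunk documents that exceptions are merged by key and that an error is raised if a key doesn't match the `ORTH` values of its tokens. The snippet below is a minimal pure-Python sketch of that behaviour, written so it runs without spaCy installed. It is not spaCy's implementation: `merge_exceptions` is a hypothetical helper, and `ORTH` / `NORM` are plain string stand-ins assumed here in place of the constants from `spacy.attrs`.

```python
# Stand-ins for spacy.attrs.ORTH / spacy.attrs.NORM (assumption: plain
# strings are enough to illustrate the dictionary shapes).
ORTH = "orth"
NORM = "norm"


def merge_exceptions(base, *additions):
    """Hypothetical helper: merge exception dicts, later ones winning.

    Each key must equal the concatenated ORTH values of its tokens,
    mirroring the check described in the docs hunk above.
    """
    merged = dict(base)
    for extra in additions:
        for key, tokens in extra.items():
            joined = "".join(tok[ORTH] for tok in tokens)
            if joined != key:
                raise ValueError(
                    f"key {key!r} does not match ORTH values {joined!r}"
                )
        merged.update(extra)
    return merged


BASE = {"a.": [{ORTH: "a."}], ":)": [{ORTH: ":)"}]}
NEW = {"a.": [{ORTH: "a.", NORM: "all"}]}
DONT = {"don't": [{ORTH: "do"}, {ORTH: "n't", NORM: "not"}]}

exceptions = merge_exceptions(BASE, NEW, DONT)
```

Here the `NEW` entry for `"a."` replaces the one from `BASE`, `":)"` is carried over unchanged, and `"don't"` passes the check because `"do" + "n't"` reproduces the key. An entry such as `{"a.": [{ORTH: "b."}]}` would raise `ValueError` in this sketch.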