Mirror of https://github.com/explosion/spaCy.git (synced 2025-01-08 16:26:37 +03:00)
Update TIGER link and tag description (#6344)
parent 58a7461cff
commit e4c3d6748c
@@ -51,11 +51,11 @@ rely on simple lookup tables.
 <Infobox title="About spaCy's custom pronoun lemma for English" variant="warning">
 
 spaCy adds a **special case for English pronouns**: all English pronouns are
-lemmatized to the special token `-PRON-`. Unlike verbs and common nouns,
-there's no clear base form of a personal pronoun. Should the lemma of "me" be
-"I", or should we normalize person as well, giving "it" — or maybe "he"?
-spaCy's solution is to introduce a novel symbol, `-PRON-`, which is used as the
-lemma for all personal pronouns.
+lemmatized to the special token `-PRON-`. Unlike verbs and common nouns, there's
+no clear base form of a personal pronoun. Should the lemma of "me" be "I", or
+should we normalize person as well, giving "it" — or maybe "he"? spaCy's
+solution is to introduce a novel symbol, `-PRON-`, which is used as the lemma
+for all personal pronouns.
 
 </Infobox>
 
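The Infobox text in this hunk describes the `-PRON-` special case only in prose. A minimal sketch of that rule follows. It is not spaCy's implementation: the names `PRON_LEMMA`, `PERSONAL_PRONOUNS`, `LOOKUP` and `lemmatize` are hypothetical, and the pronoun set and lookup table are small illustrative samples.

```python
# Illustrative sketch only: not spaCy's code. All names here are hypothetical.
PRON_LEMMA = "-PRON-"

# Assumption: a small sample of English personal pronouns, not a complete list.
PERSONAL_PRONOUNS = {
    "i", "me", "you", "he", "him", "she", "her", "it", "we", "us", "they", "them",
}

# Assumption: a tiny stand-in for the "simple lookup tables" the docs mention.
LOOKUP = {"was": "be", "mice": "mouse"}


def lemmatize(token: str) -> str:
    """Return the shared pronoun symbol for pronouns, else a table lookup."""
    lowered = token.lower()
    if lowered in PERSONAL_PRONOUNS:
        # Every personal pronoun gets the same lemma, regardless of
        # person, number or case.
        return PRON_LEMMA
    # Fall back to the lookup table, then to the lowercased form itself.
    return LOOKUP.get(lowered, lowered)
```

With this sketch, "me", "I" and "he" all map to the same symbol, which sidesteps the question of which pronoun would be the base form.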
@@ -121,7 +121,7 @@ Treebank tag set. We also map the tags to the simpler Universal Dependencies v2
 POS tag set.
 
 | Tag | POS | Morphology | Description |
-| ------------------------------------- | ------- | --------------------------------------- | ----------------------------------------- |
+| ----------------------------------- | ------- | -------------------------------------------------- | ----------------------------------------- |
 | `$` | `SYM` | | symbol, currency |
 | <InlineCode>``</InlineCode> | `PUNCT` | `PunctType=quot PunctSide=ini` | opening quotation mark |
 | `''` | `PUNCT` | `PunctType=quot PunctSide=fin` | closing quotation mark |
@@ -175,14 +175,15 @@ POS tag set.
 | `WRB` | `ADV` | | wh-adverb |
 | `XX` | `X` | | unknown |
 | `_SP` | `SPACE` | | |
+
 </Accordion>
 
 <Accordion title="German" id="pos-de">
 
 The German part-of-speech tagger uses the
-[TIGER Treebank](http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/index.html)
-annotation scheme. We also map the tags to the simpler Universal Dependencies
-v2 POS tag set.
+[TIGER Treebank](https://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger/)
+annotation scheme. We also map the tags to the simpler Universal Dependencies v2
+POS tag set.
 
 | Tag | POS | Morphology | Description |
 | --------- | ------- | ---------------------------------------- | ------------------------------------------------- |
@@ -211,7 +212,7 @@ v2 POS tag set.
 | `PDS` | `PRON` | `PronType=dem` | substituting demonstrative pronoun |
 | `PIAT` | `DET` | `PronType=ind|neg|tot` | attributive indefinite pronoun without determiner |
 | `PIS` | `PRON` | `PronType=ind|neg|tot` | substituting indefinite pronoun |
-| `PPER` | `PRON` | `PronType=prs` | non-reflexive personal pronoun |
+| `PPER` | `PRON` | `PronType=prs` | replaceable personal pronoun |
 | `PPOSAT` | `DET` | `Poss=yes PronType=prs` | attributive possessive pronoun |
 | `PPOSS` | `PRON` | `Poss=yes PronType=prs` | substituting possessive pronoun |
 | `PRELAT` | `DET` | `PronType=rel` | attributive relative pronoun |
@@ -241,6 +242,7 @@ v2 POS tag set.
 | `VVPP` | `VERB` | `Aspect=perf VerbForm=part` | perfect participle, full |
 | `XY` | `X` | | non-word containing non-letter |
 | `_SP` | `SPACE` | | |
+
 </Accordion>
 
 ---
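The tables touched by this commit pair each fine-grained tag with a coarse POS tag and a morphology string. A small sketch of how such a mapping could be represented and queried follows, using only German rows visible in the hunks above. The names `TagInfo`, `GERMAN_TAG_MAP`, `coarse_pos` and `parse_morphology` are hypothetical, the fallback to `X` for unknown tags is an assumption, and none of this is spaCy's own API.

```python
from typing import Dict, List, NamedTuple


class TagInfo(NamedTuple):
    pos: str         # coarse Universal Dependencies POS tag
    morphology: str  # space-separated Feature=value pairs, possibly empty


# Rows copied from the German table shown in the diff (a subset, not the full tag set).
GERMAN_TAG_MAP: Dict[str, TagInfo] = {
    "PDS": TagInfo("PRON", "PronType=dem"),
    "PIAT": TagInfo("DET", "PronType=ind|neg|tot"),
    "PIS": TagInfo("PRON", "PronType=ind|neg|tot"),
    "PPER": TagInfo("PRON", "PronType=prs"),
    "PPOSAT": TagInfo("DET", "Poss=yes PronType=prs"),
    "PPOSS": TagInfo("PRON", "Poss=yes PronType=prs"),
    "PRELAT": TagInfo("DET", "PronType=rel"),
    "VVPP": TagInfo("VERB", "Aspect=perf VerbForm=part"),
    "XY": TagInfo("X", ""),
    "_SP": TagInfo("SPACE", ""),
}


def coarse_pos(tag: str, default: str = "X") -> str:
    """Map a fine-grained tag to its coarse POS (assumed fallback: X)."""
    info = GERMAN_TAG_MAP.get(tag)
    return info.pos if info is not None else default


def parse_morphology(morph: str) -> Dict[str, List[str]]:
    """Split 'Poss=yes PronType=prs' into {'Poss': ['yes'], 'PronType': ['prs']}."""
    features: Dict[str, List[str]] = {}
    for pair in morph.split():
        name, _, values = pair.partition("=")
        # A '|' separates alternative values for one feature.
        features[name] = values.split("|")
    return features
```

For example, `coarse_pos("PPOSAT")` gives `"DET"`, and `parse_morphology(GERMAN_TAG_MAP["PIAT"].morphology)` yields the three alternative `PronType` values as a list.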