Update TIGER link and tag description (#6344)

Adriane Boyd 2020-11-05 09:33:00 +01:00
parent 58a7461cff
commit e4c3d6748c


@@ -51,11 +51,11 @@ rely on simple lookup tables.
 <Infobox title="About spaCy's custom pronoun lemma for English" variant="warning">
 spaCy adds a **special case for English pronouns**: all English pronouns are
-lemmatized to the special token `-PRON-`. Unlike verbs and common nouns,
-there's no clear base form of a personal pronoun. Should the lemma of "me" be
-"I", or should we normalize person as well, giving "it" — or maybe "he"?
-spaCy's solution is to introduce a novel symbol, `-PRON-`, which is used as the
-lemma for all personal pronouns.
+lemmatized to the special token `-PRON-`. Unlike verbs and common nouns, there's
+no clear base form of a personal pronoun. Should the lemma of "me" be "I", or
+should we normalize person as well, giving "it" — or maybe "he"? spaCy's
+solution is to introduce a novel symbol, `-PRON-`, which is used as the lemma
+for all personal pronouns.
 </Infobox>
@@ -121,7 +121,7 @@ Treebank tag set. We also map the tags to the simpler Universal Dependencies v2
 POS tag set.
 | Tag |  POS | Morphology | Description |
-| ------------------------------------- | ------- | --------------------------------------- | ----------------------------------------- |
+| ----------------------------------- | ------- | -------------------------------------------------- | ----------------------------------------- |
 | `$` | `SYM` | | symbol, currency |
 | <InlineCode>&#96;&#96;</InlineCode> | `PUNCT` | `PunctType=quot PunctSide=ini` | opening quotation mark |
 | `''` | `PUNCT` | `PunctType=quot PunctSide=fin` | closing quotation mark |
@@ -175,14 +175,15 @@ POS tag set.
 | `WRB` | `ADV` | | wh-adverb |
 | `XX` | `X` | | unknown |
 | `_SP` | `SPACE` | | |
 </Accordion>
 <Accordion title="German" id="pos-de">
 The German part-of-speech tagger uses the
-[TIGER Treebank](http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/index.html)
-annotation scheme. We also map the tags to the simpler Universal Dependencies
-v2 POS tag set.
+[TIGER Treebank](https://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger/)
+annotation scheme. We also map the tags to the simpler Universal Dependencies v2
+POS tag set.
 | Tag |  POS | Morphology | Description |
 | --------- | ------- | ---------------------------------------- | ------------------------------------------------- |
@@ -211,7 +212,7 @@ v2 POS tag set.
 | `PDS` | `PRON` | `PronType=dem` | substituting demonstrative pronoun |
 | `PIAT` | `DET` | `PronType=ind|neg|tot` | attributive indefinite pronoun without determiner |
 | `PIS` | `PRON` | `PronType=ind|neg|tot` | substituting indefinite pronoun |
-| `PPER` | `PRON` | `PronType=prs` | non-reflexive personal pronoun |
+| `PPER` | `PRON` | `PronType=prs` | replaceable personal pronoun |
 | `PPOSAT` | `DET` | `Poss=yes PronType=prs` | attributive possessive pronoun |
 | `PPOSS` | `PRON` | `Poss=yes PronType=prs` | substituting possessive pronoun |
 | `PRELAT` | `DET` | `PronType=rel` | attributive relative pronoun |
@@ -241,6 +242,7 @@ v2 POS tag set.
 | `VVPP` | `VERB` | `Aspect=perf VerbForm=part` | perfect participle, full |
 | `XY` | `X` | | non-word containing non-letter |
 | `_SP` | `SPACE` | | |
 </Accordion>
 ---
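
For readers unfamiliar with how a fine-grained tag table like the German one in this diff is used, here is a minimal Python sketch of a lookup from TIGER tags to Universal Dependencies POS tags and morphological features. It is illustrative only and is not spaCy's implementation: the names `TIGER_TAG_MAP`, `parse_morphology`, and `map_tag` are hypothetical, the table contains only the rows visible in the hunks above, and the fallback to `X` for unlisted tags is an assumption.

```python
# Hypothetical excerpt of a tag map; only rows visible in the diff are included.
# Each entry maps a fine-grained TIGER tag to (UD POS tag, morphology string).
TIGER_TAG_MAP = {
    "PDS": ("PRON", "PronType=dem"),
    "PIAT": ("DET", "PronType=ind|neg|tot"),
    "PIS": ("PRON", "PronType=ind|neg|tot"),
    "PPER": ("PRON", "PronType=prs"),
    "PPOSAT": ("DET", "Poss=yes PronType=prs"),
    "PPOSS": ("PRON", "Poss=yes PronType=prs"),
    "PRELAT": ("DET", "PronType=rel"),
    "VVPP": ("VERB", "Aspect=perf VerbForm=part"),
    "XY": ("X", ""),
    "_SP": ("SPACE", ""),
}


def parse_morphology(features):
    """Split a space-separated 'Key=value' string into a dict."""
    return dict(item.split("=", 1) for item in features.split())


def map_tag(tag):
    """Return (coarse POS, morphology dict) for a fine-grained tag.

    Tags missing from this excerpt fall back to ("X", {}); that fallback
    is an assumption of the sketch, not documented behaviour.
    """
    pos, features = TIGER_TAG_MAP.get(tag, ("X", ""))
    return pos, parse_morphology(features)


if __name__ == "__main__":
    for tag in ("PPER", "PPOSAT", "_SP"):
        print(tag, map_tag(tag))
```

Keeping the morphology as a single string in the table mirrors the layout of the documentation rows; parsing it on lookup keeps the table easy to compare against the docs.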