spaCy/website/api/_annotation/_named-entities.jade
ines 5453821a9f Update NER annotation scheme
Add note on training data sources and include coarse-grained Wikipedia scheme
2017-10-30 13:53:49 +01:00


//- 💫 DOCS > API > ANNOTATION > NAMED ENTITIES
p
| Models trained on the
| #[+a("https://catalog.ldc.upenn.edu/ldc2013t19") OntoNotes 5] corpus
| support the following entity types:
+table(["Type", "Description"])
+row
+cell #[code PERSON]
+cell People, including fictional.
+row
+cell #[code NORP]
+cell Nationalities or religious or political groups.
+row
+cell #[code FAC]
+cell Buildings, airports, highways, bridges, etc.
+row
+cell #[code ORG]
+cell Companies, agencies, institutions, etc.
+row
+cell #[code GPE]
+cell Countries, cities, states.
+row
+cell #[code LOC]
+cell Non-GPE locations, mountain ranges, bodies of water.
+row
+cell #[code PRODUCT]
+cell Objects, vehicles, foods, etc. (Not services.)
+row
+cell #[code EVENT]
+cell Named hurricanes, battles, wars, sports events, etc.
+row
+cell #[code WORK_OF_ART]
+cell Titles of books, songs, etc.
+row
+cell #[code LAW]
+cell Named documents made into laws.
+row
+cell #[code LANGUAGE]
+cell Any named language.
+row
+cell #[code DATE]
+cell Absolute or relative dates or periods.
+row
+cell #[code TIME]
+cell Times smaller than a day.
+row
+cell #[code PERCENT]
+cell Percentage, including "%".
+row
+cell #[code MONEY]
+cell Monetary values, including unit.
+row
+cell #[code QUANTITY]
+cell Measurements, as of weight or distance.
+row
+cell #[code ORDINAL]
+cell "first", "second", etc.
+row
+cell #[code CARDINAL]
+cell Numerals that do not fall under another type.
+h(4, "ner-wikipedia-scheme") Wikipedia scheme
p
| Models trained on the Wikipedia corpus
| (#[+a("http://www.sciencedirect.com/science/article/pii/S0004370212000276") Nothman et al., 2013])
| use a less fine-grained NER annotation scheme and recognise the
| following entities:
+table(["Type", "Description"])
+row
+cell #[code PER]
+cell Named person or family.
+row
+cell #[code LOC]
+cell
| Name of politically or geographically defined location (cities,
| provinces, countries, international regions, bodies of water,
| mountains).
+row
+cell #[code ORG]
+cell Named corporate, governmental, or other organizational entity.
+row
+cell #[code MISC]
+cell
| Miscellaneous entities, e.g. events, nationalities, products or
| works of art.
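p
| The two schemes overlap only partly. As a rough illustration, the Python
| sketch below pairs some fine-grained OntoNotes 5 types with the
| Wikipedia-scheme type whose description covers them. The mapping and the
| names #[code ONTONOTES_TO_WIKIPEDIA] and #[code to_wikipedia_label] are
| assumptions made for this example and are not part of spaCy. Types with no
| obvious counterpart, such as #[code DATE], are left unmapped.

```python
from typing import Optional

# Hypothetical mapping, not part of spaCy: fine-grained OntoNotes 5 types
# paired with the coarse Wikipedia-scheme type whose description covers them.
ONTONOTES_TO_WIKIPEDIA = {
    "PERSON": "PER",
    "GPE": "LOC",
    "LOC": "LOC",
    "ORG": "ORG",
    "NORP": "MISC",
    "EVENT": "MISC",
    "PRODUCT": "MISC",
    "WORK_OF_ART": "MISC",
}


def to_wikipedia_label(ontonotes_label: str) -> Optional[str]:
    """Return the coarse label for a fine-grained one, or None if unmapped."""
    return ONTONOTES_TO_WIKIPEDIA.get(ontonotes_label)


if __name__ == "__main__":
    # Numeric and temporal types such as DATE have no coarse counterpart here.
    for label in ("PERSON", "GPE", "WORK_OF_ART", "DATE"):
        print(label, "->", to_wikipedia_label(label))
```

p
| Returning #[code None] for unmapped types keeps the gap between the two
| schemes explicit, so calling code can decide whether to drop such entities
| or handle them separately.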