From 91dbee1b8fc86e1c9678ede0404711142d8ac628 Mon Sep 17 00:00:00 2001
From: ines
Date: Tue, 24 Oct 2017 16:17:03 +0200
Subject: [PATCH] Add BILUO docs to NER annotation scheme

---
 website/docs/api/annotation.jade | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/website/docs/api/annotation.jade b/website/docs/api/annotation.jade
index d4b01a819..155c4d13b 100644
--- a/website/docs/api/annotation.jade
+++ b/website/docs/api/annotation.jade
@@ -86,6 +86,31 @@ include _annotation/_dep-labels
 
 include _annotation/_named-entities
 
+    | showed that the minimal #[strong Begin], #[strong In], #[strong Out]
+    | scheme was more difficult to learn than the #[strong BILUO] scheme that
+    | we use, which explicitly marks boundary tokens.
+
++table(["Tag", "Description"])
+    +row
+        +cell #[code #[span.u-color-theme B] EGIN]
+        +cell The first token of a multi-token entity.
+
+    +row
+        +cell #[code #[span.u-color-theme I] N]
+        +cell An inner token of a multi-token entity.
+
+    +row
+        +cell #[code #[span.u-color-theme L] AST]
+        +cell The final token of a multi-token entity.
+
+    +row
+        +cell #[code #[span.u-color-theme U] NIT]
+        +cell A single-token entity.
+
+    +row
+        +cell #[code #[span.u-color-theme O] UT]
+        +cell A non-entity token.
+
 +h(2, "json-input") JSON input format for training
 
 p
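
The tag table added by this patch can be illustrated with a small, self-contained sketch. It is not spaCy's implementation: `spans_to_biluo` is a hypothetical helper, the token-index span format `(start, end, label)` with an exclusive `end` is an assumption made for the example, and the `TAG-LABEL` string form (for example `B-GPE`) is assumed to follow the usual convention for BILUO tags.

```python
from typing import List, Tuple


def spans_to_biluo(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign one BILUO tag per token.

    Hypothetical helper for illustration only. Each span is
    (start, end, label) in token indices, with `end` exclusive.
    """
    tags = ["O"] * n_tokens  # O(ut): non-entity token by default
    for start, end, label in spans:
        if not 0 <= start < end <= n_tokens:
            raise ValueError(f"invalid span: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        if end - start == 1:
            tags[start] = f"U-{label}"  # U(nit): single-token entity
        else:
            tags[start] = f"B-{label}"  # B(egin): first token
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"  # I(n): inner tokens
            tags[end - 1] = f"L-{label}"  # L(ast): final token
    return tags


tokens = ["Alex", "flew", "to", "New", "York", "City", "today"]
tags = spans_to_biluo(len(tokens), [(0, 1, "PERSON"), (3, 6, "GPE")])
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In this example the single-token entity "Alex" receives a `U` tag, while "New York City" is marked with `B`, `I` and `L`, so both boundaries of the multi-token entity are explicit, which is the property the patch text attributes to the BILUO scheme.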