Add documentation for spaCy's JSON format

ines 2017-03-26 15:56:15 +02:00
parent 007a2492bd
commit 13df2d6a60
2 changed files with 32 additions and 1 deletions

@@ -79,3 +79,33 @@ p
+h(2, "named-entities") Named Entity Recognition
include _annotation/_named-entities
+h(2, "json-input") JSON input format for training
p
| spaCy takes training data in the following format:
+code("Example structure").
    doc: {
        id: string,
        paragraphs: [{
            raw: string,
            sents: [int],
            tokens: [{
                start: int,
                tag: string,
                head: int,
                dep: string
            }],
            ner: [{
                start: int,
                end: int,
                label: string
            }],
            brackets: [{
                start: int,
                end: int,
                label: string
            }]
        }]
    }
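
For reviewers, a minimal sketch of what a document in this structure might look like as a Python dict, with a purely structural check of the field types listed above. The helper names (`check_fields`, `is_valid_doc`) and all sample values are hypothetical. The diff does not say whether `start`/`end` are character or token offsets, how `head` is encoded, or whether the object is wrapped in a `doc` key, so the example only checks types and treats the object after `doc:` as the document itself.

```python
import json

# Expected type of each field, copied from the documented structure.
TOKEN_FIELDS = {"start": int, "tag": str, "head": int, "dep": str}
SPAN_FIELDS = {"start": int, "end": int, "label": str}


def check_fields(item, fields):
    """Return True if item is a dict carrying every field with the listed type."""
    return isinstance(item, dict) and all(
        name in item and isinstance(item[name], kind)
        for name, kind in fields.items()
    )


def is_valid_doc(doc):
    """Structural check only; it does not validate offsets or label values."""
    if not isinstance(doc, dict) or not isinstance(doc.get("id"), str):
        return False
    paragraphs = doc.get("paragraphs")
    if not isinstance(paragraphs, list):
        return False
    for para in paragraphs:
        if not isinstance(para, dict) or not isinstance(para.get("raw"), str):
            return False
        sents = para.get("sents")
        if not isinstance(sents, list) or not all(isinstance(i, int) for i in sents):
            return False
        for key, fields in (
            ("tokens", TOKEN_FIELDS),
            ("ner", SPAN_FIELDS),
            ("brackets", SPAN_FIELDS),
        ):
            items = para.get(key)
            if not isinstance(items, list):
                return False
            if not all(check_fields(item, fields) for item in items):
                return False
    return True


# Illustrative values only: the offsets, tags and labels are assumptions,
# not taken from the commit.
example = {
    "id": "doc-0",
    "paragraphs": [{
        "raw": "Ines writes docs.",
        "sents": [0],
        "tokens": [
            {"start": 0, "tag": "NNP", "head": 1, "dep": "nsubj"},
            {"start": 5, "tag": "VBZ", "head": 1, "dep": "ROOT"},
            {"start": 12, "tag": "NNS", "head": 1, "dep": "dobj"},
            {"start": 16, "tag": ".", "head": 1, "dep": "punct"},
        ],
        "ner": [{"start": 0, "end": 4, "label": "PERSON"}],
        "brackets": [],
    }],
}

serialized = json.dumps(example, indent=2)
```

A check like this can catch a missing key or a wrongly typed field before the file is handed to the training command, but it says nothing about whether the annotations themselves are consistent.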

@@ -143,7 +143,8 @@ p
+tag experimental
p
| Train a model. Expects data in spaCy's JSON format.
| Train a model. Expects data in spaCy's
| #[+a("/docs/api/annotation#json-input") JSON format].
+code(false, "bash").
python -m spacy train [lang] [output_dir] [train_data] [dev_data] [--n_iter] [--parser_L1] [--no_tagger] [--no_parser] [--no_ner]