From c9f9203d5f8ac535d3f5104d8a9883c441eb886b Mon Sep 17 00:00:00 2001
From: "M. Z. Ferdous (Imran)" <1205081.mzfs@ugrad.cse.buet.ac.bd>
Date: Thu, 27 Apr 2017 16:48:54 +0600
Subject: [PATCH 1/2] fix typo, CONLL format

tried to google about connlu format. Saw there is conll format, not connlu.
---
 website/docs/usage/adding-languages.jade | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/usage/adding-languages.jade b/website/docs/usage/adding-languages.jade
index 67ac8d610..03a1eae43 100644
--- a/website/docs/usage/adding-languages.jade
+++ b/website/docs/usage/adding-languages.jade
@@ -544,7 +544,7 @@ p
 p
     | You can now train the model using a corpus for your language annotated
     | with #[+a("http://universaldependencies.org/") Universal Dependencies].
-    | If your corpus uses the connlu format, you can use the
+    | If your corpus uses the CONLL format, you can use the
     | #[+a("/docs/usage/cli#convert") #[code convert] command] to convert it to
     | spaCy's #[+a("/docs/api/annotation#json-input") JSON format] for training.
 

From fb96f88b59bfafc77749268c2bd34b6dc65e5a5d Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Thu, 27 Apr 2017 14:36:08 +0200
Subject: [PATCH 2/2] Update info on CoNLL format and include link

---
 website/docs/usage/adding-languages.jade | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/website/docs/usage/adding-languages.jade b/website/docs/usage/adding-languages.jade
index 03a1eae43..30c4486b0 100644
--- a/website/docs/usage/adding-languages.jade
+++ b/website/docs/usage/adding-languages.jade
@@ -544,7 +544,9 @@ p
 p
     | You can now train the model using a corpus for your language annotated
     | with #[+a("http://universaldependencies.org/") Universal Dependencies].
-    | If your corpus uses the CONLL format, you can use the
+    | If your corpus uses the
+    | #[+a("http://universaldependencies.org/docs/format.html") CoNLL-U] format,
+    | i.e. files with the extension #[code .conllu], you can use the
     | #[+a("/docs/usage/cli#convert") #[code convert] command] to convert it to
     | spaCy's #[+a("/docs/api/annotation#json-input") JSON format] for training.
 