mirror of https://github.com/explosion/spaCy.git
synced 2024-11-11 04:08:09 +03:00

Add training 101
This commit is contained in:
parent abed463bbb
commit 2f40d6e7e7

@@ -1,3 +1,52 @@
//- 💫 DOCS > USAGE > SPACY 101 > TRAINING
+under-construction
p
    | spaCy's models are #[strong statistical] and every "decision" they make –
    | for example, which part-of-speech tag to assign, or whether a word is a
    | named entity – is a #[strong prediction]. This prediction is based
    | on the examples the model has seen during #[strong training]. To train
    | a model, you first need training data – examples of text, and the
    | labels you want the model to predict. This could be a part-of-speech tag,
    | a named entity or any other information.

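To make "examples of text, and the labels you want the model to predict" concrete, here is a minimal sketch of labelled training data for named entity recognition. The character-offset format follows spaCy's training examples; `TRAIN_DATA` is just an illustrative variable name.

```python
# A minimal sketch of labelled training data for named entity recognition:
# each example pairs a text with the annotations the model should learn to
# predict (here, character offsets of named entities and their labels).
TRAIN_DATA = [
    ("Uber blew through $1 million a week", {"entities": [(0, 4, "ORG")]}),
    ("Google rebrands its business apps", {"entities": [(0, 6, "ORG")]}),
]

# Each annotation marks the span of text and the label to predict for it.
for text, annotations in TRAIN_DATA:
    for start, end, label in annotations["entities"]:
        print(text[start:end], "->", label)
```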
p
    | The model is then shown the unlabelled text and will make a prediction.
    | Because we know the correct answer, we can give the model feedback on its
    | prediction in the form of an #[strong error gradient] of the
    | #[strong loss function], which calculates the difference between the
    | model's prediction and the expected output. The greater the difference,
    | the more significant the gradient and the updates to our model.

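As a toy illustration of this idea (a one-weight "model" with a squared-error loss, not spaCy's actual machinery), the size of the error determines the size of the gradient and therefore the size of the update:

```python
# Toy sketch of gradient-based learning – not spaCy's internals.
# The loss measures the difference between the model's prediction and the
# expected output; its gradient tells us how strongly to update the weight.
def loss(prediction, target):
    return (prediction - target) ** 2

def gradient(prediction, target):
    return 2 * (prediction - target)  # derivative of the squared-error loss

weight = 0.0          # a one-weight "model": prediction = weight * x
x, target = 1.0, 1.0  # a single training example
learning_rate = 0.1

for step in range(20):
    prediction = weight * x
    # the bigger the error, the bigger the gradient – and the update
    weight -= learning_rate * gradient(prediction, target) * x

print(round(weight, 3))  # converges towards the target of 1.0
```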
+aside
    | #[strong Training data:] Examples and their annotations.#[br]
    | #[strong Text:] The input text the model should predict a label for.#[br]
    | #[strong Label:] The label the model should predict.#[br]
    | #[strong Gradient:] Gradient of the loss function calculating the
    | difference between the prediction and the expected output.

+image
    include ../../../assets/img/docs/training.svg
    .u-text-right
        +button("/assets/img/docs/training.svg", false, "secondary").u-text-tag View large graphic

p
    | When training a model, we don't just want it to memorise our examples –
    | we want it to come up with a theory that can be
    | #[strong generalised across other examples]. After all, we don't just want
    | the model to learn that this one instance of "Amazon" right here is a
    | company – we want it to learn that "Amazon", in contexts #[em like this],
    | is most likely a company. That's why the training data should always be
    | representative of the data we want to process. A model trained on
    | Wikipedia, where sentences in the first person are extremely rare, will
    | likely perform badly on Twitter. Similarly, a model trained on romantic
    | novels will likely perform badly on legal text.

p
    | This also means that in order to know how the model is performing, and
    | whether it's learning the right things, you don't only need
    | #[strong training data] – you'll also need #[strong evaluation data]. If
    | you only test the model with the data it was trained on, you'll have no
    | idea how well it's generalising. If you want to train a model from scratch,
    | you usually need at least a few hundred examples for both training and
    | evaluation. To update an existing model, you can already achieve decent
    | results with very few examples – as long as they're representative.
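A common way to get evaluation data is simply to hold out a share of your examples before training. A minimal sketch (the placeholder strings stand in for real annotated examples):

```python
import random

# Sketch: hold out part of the examples for evaluation, so we can measure
# how well the model generalises to data it never saw during training.
# (Placeholder examples – real data would be texts with annotations.)
examples = [f"example {i}" for i in range(100)]
random.shuffle(examples)

split = int(len(examples) * 0.8)
train_data = examples[:split]  # used to update the model
eval_data = examples[split:]   # used only to measure performance

print(len(train_data), len(eval_data))  # 80 20
```

Shuffling before splitting keeps both portions representative of the whole dataset rather than of whatever order the examples happened to arrive in.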
@@ -252,6 +252,12 @@ include _spacy-101/_serialization

include _spacy-101/_training

+infobox
    | To learn more about #[strong training and updating] models, how to create
    | training data and how to improve spaCy's named entity recognition models,
    | see the usage guides on #[+a("/docs/usage/training") training] and
    | #[+a("/docs/usage/training-ner") training the named entity recognizer].

+h(2, "architecture") Architecture

+under-construction