Update NER training draft

This commit is contained in:
ines 2017-06-01 12:51:36 +02:00
parent 04fac3f52a
commit 8274dffad6

@@ -8,22 +8,23 @@ p
 | particularly useful as a "quick and dirty solution", if you have only a
 | few corrections or annotations.
++under-construction
 +h(2, "improving-accuracy") Improving accuracy on existing entity types
 p
 | To update the model, you first need to create an instance of
-| #[+api("goldparse") #[code spacy.gold.GoldParse]], with the entity labels
-| you want to learn. You will then pass this instance to the
-| #[+api("entityrecognizer#update") #[code EntityRecognizer.update()]]
-| method.
+| #[+api("goldparse") #[code GoldParse]], with the entity labels
+| you want to learn. You'll usually need to provide many examples to
+| meaningfully improve the system — a few hundred is a good start, although
+| more is better.
++image
+include ../../assets/img/docs/training-loop.svg
+.u-text-right
++button("/assets/img/docs/training-loop.svg", false, "secondary").u-text-tag View large graphic
 p
-| You'll usually need to provide many examples to meaningfully improve the
-| system — a few hundred is a good start, although more is better. You
-| should avoid iterating over the same few examples multiple times, or the
-| model is likely to "forget" how to annotate other examples. If you
+| You should avoid iterating over the same few examples multiple times, or
+| the model is likely to "forget" how to annotate other examples. If you
 | iterate over the same few examples, you're effectively changing the loss
 | function. The optimizer will find a way to minimize the loss on your
 | examples, without regard for the consequences on the examples it's no
@@ -39,6 +40,8 @@ p
 +h(2, "example") Example
++under-construction
 +code.
 import random
 from spacy.lang.en import English