spaCy/website/docs/api
Daniël de Kok b734e5314d
Avoid TrainablePipe.finish_update getting called twice during training (#12450)
* Avoid `TrainablePipe.finish_update` getting called twice during training

PR #12136 fixed an issue where the tok2vec pipe was updated before
gradients were accumulated. However, it introduced a new bug that caused
`finish_update` to be called twice when using the training loop. This
causes a fairly large slowdown.

The `Language.update` method accepts the `sgd` argument for passing an
optimizer. This argument has three possible values:

- `Optimizer`: use the given optimizer to finish pipe updates.
- `None`: use a default optimizer to finish pipe updates.
- `False`: do not finish pipe updates.

However, the last option was not documented and not valid with the
existing type of `sgd`. I assumed that this was a remnant of earlier
spaCy versions and removed handling of `False`.

However, with that change, we were passing `None` to `Language.update`.
As a result, we were calling `finish_update` both in `Language.update`
and in the training loop after all subbatches were processed.

This change restores proper handling/use of `False`. Moreover, the role
of `False` is now documented and added to the type to avoid future
accidents.
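
The three-way handling of `sgd` described above can be sketched with a
small toy model. This is not spaCy's code: `ToyPipe`, `ToyOptimizer`,
`toy_update` and `train_step` are hypothetical stand-ins that only count
how often `finish_update` runs when the training loop passes `False`
versus `None`.

```python
from typing import List, Literal, Union


class ToyOptimizer:
    """Hypothetical stand-in for an optimizer (not thinc's Optimizer)."""


class ToyPipe:
    """Hypothetical pipe that only counts finish_update calls."""

    def __init__(self) -> None:
        self.finish_calls = 0
        self.pending = 0  # number of examples with accumulated gradients

    def update(self, batch: List[str]) -> None:
        # Accumulate "gradients" without applying them.
        self.pending += len(batch)

    def finish_update(self, sgd: ToyOptimizer) -> None:
        # Apply the accumulated "gradients" and reset.
        self.finish_calls += 1
        self.pending = 0


DEFAULT_OPTIMIZER = ToyOptimizer()


def toy_update(
    pipe: ToyPipe,
    batch: List[str],
    sgd: Union[ToyOptimizer, None, Literal[False]] = None,
) -> None:
    """Toy version of the dispatch: Optimizer / None / False."""
    pipe.update(batch)
    if sgd is False:
        # Do not finish the update; the caller does it later.
        return
    optimizer = DEFAULT_OPTIMIZER if sgd is None else sgd
    pipe.finish_update(optimizer)


def train_step(pipe, subbatches, optimizer, sgd_for_update) -> int:
    """Toy training step: update per subbatch, then finish once."""
    for subbatch in subbatches:
        toy_update(pipe, subbatch, sgd=sgd_for_update)
    pipe.finish_update(optimizer)
    return pipe.finish_calls


subbatches = [["a", "b"], ["c"]]
calls_with_false = train_step(ToyPipe(), subbatches, ToyOptimizer(), False)
calls_with_none = train_step(ToyPipe(), subbatches, ToyOptimizer(), None)
print(calls_with_false, calls_with_none)
```

In this toy, passing `False` leaves the single `finish_update` call to the
training loop, while passing `None` adds one extra call per subbatch on
top of it, which is the kind of redundant work the fix avoids.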

* Fix typo

* Document defaults for `Language.update`
2023-03-30 09:30:42 +02:00
architectures.mdx Merge branch 'master' into sync/master-into-v4 2023-03-02 16:24:15 +01:00
attributeruler.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
attributes.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
cli.mdx Merge branch 'master' into sync/master-into-v4 2023-03-02 16:24:15 +01:00
coref.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
corpus.mdx Add spacy.PlainTextCorpusReader.v1 (#12122) 2023-01-26 11:33:22 +01:00
cython-classes.mdx Refactor lexeme mem passing (#12125) 2023-01-25 12:50:21 +09:00
cython-structs.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
cython.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
data-formats.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
dependencymatcher.mdx Add new tags in docs for #12334 (#12348) 2023-03-01 10:46:13 +01:00
dependencyparser.mdx Add Language.distill (#12116) 2023-01-30 12:44:11 +01:00
doc.mdx Return Tuple[Span] for all Doc/Span attrs that provide spans (#12288) 2023-03-01 16:00:02 +01:00
docbin.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
edittreelemmatizer.mdx Add Language.distill (#12116) 2023-01-30 12:44:11 +01:00
entitylinker.mdx Merge branch 'master' into sync/master-into-v4 2023-03-02 16:24:15 +01:00
entityrecognizer.mdx Add Language.distill (#12116) 2023-01-30 12:44:11 +01:00
entityruler.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
example.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
index.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
inmemorylookupkb.mdx Entity linking: use SpanGroup instead of Iterable[Span] for mentions (#12344) 2023-03-20 12:25:18 +01:00
kb.mdx Entity linking: use SpanGroup instead of Iterable[Span] for mentions (#12344) 2023-03-20 12:25:18 +01:00
language.mdx Avoid TrainablePipe.finish_update getting called twice during training (#12450) 2023-03-30 09:30:42 +02:00
legacy.mdx Merge the parser refactor into v4 (#10940) 2023-01-18 11:27:45 +01:00
lemmatizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
lexeme.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
lookups.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
matcher.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
morphologizer.mdx Cleanup/remove backwards compat overwrite settings (#11888) 2023-02-02 14:13:38 +01:00
morphology.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
phrasematcher.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
pipe.mdx Add Language.distill (#12116) 2023-01-30 12:44:11 +01:00
pipeline-functions.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
scorer.mdx Rename language codes (Icelandic, multi-language) (#12149) 2023-01-31 17:30:43 +01:00
sentencerecognizer.mdx Add Language.distill (#12116) 2023-01-30 12:44:11 +01:00
sentencizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
span-resolver.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
span.mdx Return Tuple[Span] for all Doc/Span attrs that provide spans (#12288) 2023-03-01 16:00:02 +01:00
spancategorizer.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
spangroup.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
spanruler.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
stringstore.mdx Add info that Vocab and StringStore are not static in docs (#12427) 2023-03-27 09:18:23 +02:00
tagger.mdx Add Language.distill (#12116) 2023-01-30 12:44:11 +01:00
textcategorizer.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
tok2vec.mdx Tok2Vec: Add distill method (#12108) 2023-03-09 09:37:19 +01:00
token.mdx Merge branch 'copy_master' into copy_v4 2023-01-11 18:40:55 +01:00
tokenizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
top-level.mdx Merge branch 'master' into sync/master-into-v4 2023-03-02 16:24:15 +01:00
transformer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
vectors.mdx Remove names for vectors (#12243) 2023-02-08 14:37:42 +01:00
vocab.mdx Add info that Vocab and StringStore are not static in docs (#12427) 2023-03-27 09:18:23 +02:00