spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-11-27 13:26:07 +03:00

Daniël de Kok c53606d3b3 Avoid `TrainablePipe.finish_update` getting called twice during training

PR #12136 fixed an issue where the tok2vec pipe was updated before gradients were accumulated. However, it introduced a new bug that causes `finish_update` to be called twice when using the training loop. This causes a fairly large slowdown.

The `Language.update` method accepts the `sgd` argument for passing an optimizer. This argument has three possible values:

- `Optimizer`: use the given optimizer to finish pipe updates.
- `None`: use a default optimizer to finish pipe updates.
- `False`: do not finish pipe updates.

However, the last option was not documented and not valid with the existing type of `sgd`. I assumed that this was a remnant of earlier spaCy versions and removed handling of `False`.

However, with that change, we were passing `None` to `Language.update`. As a result, we were calling `finish_update` in both `Language.update` and in the training loop after all subbatches are processed.

This change restores proper handling/use of `False`. Moreover, the role of `False` is now documented and added to the type to avoid future accidents.

2023-03-20 11:55:14 +01:00
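The three `sgd` values described in the commit message can be illustrated with a toy model. This is a minimal sketch, not spaCy's implementation: `ToyOptimizer`, `ToyPipe`, `ToyLanguage`, `train_step`, and `count_finish_calls` are hypothetical names introduced only for this example, and the loop is assumed to finish pipe updates once after all subbatches, as the message describes.

```python
from typing import List, Literal, Union


class ToyOptimizer:
    """Hypothetical stand-in for an optimizer object."""


class ToyPipe:
    """Hypothetical pipe that counts how often its update is finished."""

    def __init__(self) -> None:
        self.pending = 0       # examples whose gradients are accumulated
        self.finish_calls = 0  # number of finish_update invocations

    def update(self, batch: List[str]) -> None:
        # Accumulate "gradients" without applying them.
        self.pending += len(batch)

    def finish_update(self, sgd: ToyOptimizer) -> None:
        # Apply the accumulated "gradients" and reset the accumulator.
        self.finish_calls += 1
        self.pending = 0


class ToyLanguage:
    """Hypothetical pipeline mimicking the three-valued `sgd` argument."""

    def __init__(self, pipes: List[ToyPipe]) -> None:
        self.pipes = pipes
        self.default_optimizer = ToyOptimizer()

    def update(
        self,
        batch: List[str],
        sgd: Union[ToyOptimizer, None, Literal[False]] = None,
    ) -> None:
        for pipe in self.pipes:
            pipe.update(batch)
        if sgd is False:
            # False: leave finishing the update to the caller.
            return
        # None: fall back to a default optimizer; otherwise use the given one.
        optimizer = self.default_optimizer if sgd is None else sgd
        for pipe in self.pipes:
            pipe.finish_update(optimizer)


def train_step(nlp, subbatches, optimizer, sgd_for_update) -> None:
    """Accumulate over all subbatches, then finish each pipe once."""
    for subbatch in subbatches:
        nlp.update(subbatch, sgd=sgd_for_update)
    for pipe in nlp.pipes:
        pipe.finish_update(optimizer)


def count_finish_calls(sgd_for_update, subbatches) -> int:
    """Run one toy training step and report the finish_update call count."""
    pipe = ToyPipe()
    train_step(ToyLanguage([pipe]), subbatches, ToyOptimizer(), sgd_for_update)
    return pipe.finish_calls
```

In this toy model, `count_finish_calls(None, [["a", "b"]])` finishes the update twice for a single subbatch (once inside `update`, once in the loop), while `count_finish_calls(False, [["a", "b"]])` finishes it once, which is the behavior the commit restores.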
converters	Rename language codes (Icelandic, multi-language) (#12149)	2023-01-31 17:30:43 +01:00
__init__.pxd	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
__init__.py	Merge remote-tracking branch 'upstream/master' into update-v4-from-master-1	2023-01-27 08:29:09 +01:00
align.pyx	Fix alignment for 1-to-1 tokens and lowercasing (#6476)	2020-12-08 14:25:16 +08:00
alignment_array.pxd	Alignment: use a simplified ragged type for performance (#10319)	2022-04-01 09:02:06 +02:00
alignment_array.pyx	Backport parser/alignment optimizations from `feature/refactor-parser` (#10952)	2022-06-24 13:39:52 +02:00
alignment.py	Alignment: use a simplified ragged type for performance (#10319)	2022-04-01 09:02:06 +02:00
augment.py	Preserve missing entity annotation in augmenters (#11540)	2022-09-27 10:16:51 +02:00
batchers.py	Fix batching regression (#12094)	2023-01-18 18:28:30 +01:00
callbacks.py	Have logging calls use string formatting types (#12215)	2023-02-02 11:15:22 +01:00
corpus.py	Have logging calls use string formatting types (#12215)	2023-02-02 11:15:22 +01:00
example.pxd	Make a pre-check to speed up alignment cache (#6139)	2020-09-24 18:13:39 +02:00
example.pyx	Merge the parser refactor into `v4` (#10940)	2023-01-18 11:27:45 +01:00
gold_io.pyx	Fix is_sent_start when converting from JSON (fix #7635) (#7655)	2021-04-08 18:24:52 +10:00
initialize.py	Merge branch 'master' into sync/master-into-v4	2023-03-02 16:24:15 +01:00
iob_utils.py	Preserve missing entity annotation in augmenters (#11540)	2022-09-27 10:16:51 +02:00
loggers.py	New console logger with expanded progress tracking (#11972)	2022-12-23 15:21:44 +01:00
loop.py	Avoid `TrainablePipe.finish_update` getting called twice during training	2023-03-20 11:55:14 +01:00
pretrain.py	Clarify how to fill in init_tok2vec after pretraining (#9639)	2021-11-18 15:38:30 +01:00