spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-03-04 11:25:51 +03:00

Daniël de Kok 8a5814bf2c	Add distillation loop (#12542)	2023-04-21 13:49:40 +02:00

* Add distillation initialization and loop
* Fix up configuration keys
* Add docstring
* Type annotations
* init_nlp_distill -> init_nlp_student
* Do not resolve dot name distill corpus in initialization (Since we don't use it.)
* student: do not request use of optimizer in student pipe
  We apply finish up the updates once in the training loop instead. Also add the necessary logic to `Language.distill` to mirror `Language.update`.
* Correctly determine sort key in subdivide_batch
* Fix _distill_loop docstring wrt. stopping condition
* _distill_loop: fix distill_data docstring
  Make similar changes in train_while_improving, since it also had incorrect types and missing type annotations.
* Move `set_{gpu_allocator,seed}_from_config` to spacy.util
* Update Language.update docs for the sgd argument
* Type annotation

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

converters	Rename language codes (Icelandic, multi-language) (#12149)	2023-01-31 17:30:43 +01:00
__init__.pxd	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
__init__.py	Merge remote-tracking branch 'upstream/master' into update-v4-from-master-1	2023-01-27 08:29:09 +01:00
align.pyx	Fix alignment for 1-to-1 tokens and lowercasing (#6476)	2020-12-08 14:25:16 +08:00
alignment_array.pxd	Alignment: use a simplified ragged type for performance (#10319)	2022-04-01 09:02:06 +02:00
alignment_array.pyx	Backport parser/alignment optimizations from `feature/refactor-parser` (#10952)	2022-06-24 13:39:52 +02:00
alignment.py	Alignment: use a simplified ragged type for performance (#10319)	2022-04-01 09:02:06 +02:00
augment.py	Preserve missing entity annotation in augmenters (#11540)	2022-09-27 10:16:51 +02:00
batchers.py	Fix batching regression (#12094)	2023-01-18 18:28:30 +01:00
callbacks.py	Have logging calls use string formatting types (#12215)	2023-02-02 11:15:22 +01:00
corpus.py	Have logging calls use string formatting types (#12215)	2023-02-02 11:15:22 +01:00
example.pxd	Make a pre-check to speed up alignment cache (#6139)	2020-09-24 18:13:39 +02:00
example.pyx	Merge the parser refactor into `v4` (#10940)	2023-01-18 11:27:45 +01:00
gold_io.pyx	Fix is_sent_start when converting from JSON (fix #7635) (#7655)	2021-04-08 18:24:52 +10:00
initialize.py	Add distillation loop (#12542)	2023-04-21 13:49:40 +02:00
iob_utils.py	Preserve missing entity annotation in augmenters (#11540)	2022-09-27 10:16:51 +02:00
loggers.py	New console logger with expanded progress tracking (#11972)	2022-12-23 15:21:44 +01:00
loop.py	Add distillation loop (#12542)	2023-04-21 13:49:40 +02:00
pretrain.py	Clarify how to fill in init_tok2vec after pretraining (#9639)	2021-11-18 15:38:30 +01:00