spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-01-02 06:53:28 +03:00

Adriane Boyd eed4b785f5 Load vocab lookups tables at beginning of training	2020-09-18 15:59:16 +02:00

Similar to how vectors are handled, move the vocab lookups to be loaded at the start of training rather than when the vocab is initialized, since the vocab doesn't have access to the full config when it's created. The option moves from `nlp.load_vocab_data` to `training.lookups`.

Typically these tables will come from `spacy-lookups-data`, but any `Lookups` object can be provided.

The loading from `spacy-lookups-data` is now strict, so configs for each language should specify the exact tables required. This also makes it easier to control whether the larger clusters and probs tables are included.

To load `lexeme_norm` from `spacy-lookups-data`:

```
[training.lookups]
@misc = "spacy.LoadLookupsData.v1"
lang = ${nlp.lang}
tables = ["lexeme_norm"]
```
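To make the shape of the `[training.lookups]` block concrete, here is a minimal standard-library sketch of how such a block could be read into a plain dictionary. It is not spaCy's own config loader. The `[nlp]` section with `lang = "en"`, the `CONFIG_TEXT` constant, and the `read_lookups_block` helper are assumptions added for illustration, and the sketch only resolves a reference that makes up a whole value, as `${nlp.lang}` does here.

```python
import configparser
import json

# Hypothetical config fragment: the [nlp] section is an assumption added so
# that the ${nlp.lang} reference has something to point at.
CONFIG_TEXT = """
[nlp]
lang = "en"

[training.lookups]
@misc = "spacy.LoadLookupsData.v1"
lang = ${nlp.lang}
tables = ["lexeme_norm"]
"""


def read_lookups_block(text):
    """Return the [training.lookups] block as a plain dict (illustrative helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys such as "@misc" unchanged
    parser.read_string(text)
    block = {}
    for key, raw in parser["training.lookups"].items():
        if raw.startswith("${") and raw.endswith("}"):
            # Resolve a whole-value reference like ${nlp.lang} by hand.
            section, _, option = raw[2:-1].rpartition(".")
            raw = parser[section][option]
        # Values in the fragment are written as JSON literals.
        block[key] = json.loads(raw)
    return block


lookups_block = read_lookups_block(CONFIG_TEXT)
print(lookups_block)
```

The resulting dictionary names the registered function under `@misc` and lists exactly the tables requested, which matches the commit's point that each config should spell out the tables it needs.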
project	Fix sparse checkout and error handling	2020-09-14 14:12:58 +02:00
templates	generalize corpora, dot notation for dev and train corpus	2020-09-17 11:38:59 +02:00
__init__.py	"model" terminology consistency in docs	2020-09-03 13:13:03 +02:00
_util.py	Fix sparse checkout and error handling	2020-09-14 14:12:58 +02:00
convert.py	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
debug_config.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
debug_data.py	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
debug_model.py	Tidy up and auto-format [ci skip]	2020-09-13 10:55:36 +02:00
download.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
evaluate.py	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
info.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
init_config.py	Use consistent shortcut	2020-09-17 16:57:02 +02:00
init_model.py	Tidy up and auto-format [ci skip]	2020-09-13 10:55:36 +02:00
package.py	Support overwriting name on spacy package	2020-09-11 11:38:28 +02:00
pretrain.py	cleanup and formatting	2020-09-17 11:48:04 +02:00
profile.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00
train.py	Load vocab lookups tables at beginning of training	2020-09-18 15:59:16 +02:00
validate.py	Update docs links in codebase	2020-09-04 12:58:50 +02:00