//- 💫 DOCS > API > TOP-LEVEL > COMMAND LINE INTERFACE p | As of v1.7.0, spaCy comes with new command line helpers to download and | link models and show useful debugging information. For a list of available | commands, type #[code spacy --help]. +h(3, "download") Download p | Download #[+a("/usage/models") models] for spaCy. The downloader finds the | best-matching compatible version, uses pip to download the model as a | package and automatically creates a | #[+a("/usage/models#usage") shortcut link] to load the model by name. | Direct downloads don't perform any compatibility checks and require the | model name to be specified with its version (e.g., #[code en_core_web_sm-1.2.0]). +code(false, "bash", "$"). spacy download [model] [--direct] +table(["Argument", "Type", "Description"]) +row +cell #[code model] +cell positional +cell Model name or shortcut (#[code en], #[code de], #[code vectors]). +row +cell #[code --direct], #[code -d] +cell flag +cell Force direct download of exact model version. +row +cell #[code --help], #[code -h] +cell flag +cell Show help message and available arguments. +aside("Downloading best practices") | The #[code download] command is mostly intended as a convenient, | interactive wrapper – it performs compatibility checks and prints | detailed messages in case things go wrong. It's #[strong not recommended] | to use this command as part of an automated process. If you know which | model your project needs, you should consider a | #[+a("/usage/models#download-pip") direct download via pip], or | uploading the model to a local PyPi installation and fetching it straight | from there. This will also allow you to add it as a versioned package | dependency to your project. +h(3, "link") Link p | Create a #[+a("/usage/models#usage") shortcut link] for a model, | either a Python package or a local directory. This will let you load | models from any location using a custom name via | #[+api("spacy#load") #[code spacy.load()]]. +infobox("Important note") | In spaCy v1.x, you had to use the model data directory to set up a shortcut | link for a local path. As of v2.0, spaCy expects all shortcut links to | be #[strong loadable model packages]. If you want to load a data directory, | call #[+api("spacy#load") #[code spacy.load()]] or | #[+api("language#from_disk") #[code Language.from_disk()]] with the path, | or use the #[+api("cli#package") #[code package]] command to create a | model package. +code(false, "bash", "$"). spacy link [origin] [link_name] [--force] +table(["Argument", "Type", "Description"]) +row +cell #[code origin] +cell positional +cell Model name if package, or path to local directory. +row +cell #[code link_name] +cell positional +cell Name of the shortcut link to create. +row +cell #[code --force], #[code -f] +cell flag +cell Force overwriting of existing link. +row +cell #[code --help], #[code -h] +cell flag +cell Show help message and available arguments. +h(3, "info") Info p | Print information about your spaCy installation, models and local setup, | and generate #[+a("https://en.wikipedia.org/wiki/Markdown") Markdown]-formatted | markup to copy-paste into #[+a(gh("spacy") + "/issues") GitHub issues]. +code(false, "bash"). spacy info [--markdown] spacy info [model] [--markdown] +table(["Argument", "Type", "Description"]) +row +cell #[code model] +cell positional +cell A model, i.e. shortcut link, package name or path (optional). +row +cell #[code --markdown], #[code -md] +cell flag +cell Print information as Markdown. +row +cell #[code --help], #[code -h] +cell flag +cell Show help message and available arguments. +h(3, "validate") Validate +tag-new(2) p | Find all models installed in the current environment (both packages and | shortcut links) and check whether they are compatible with the currently | installed version of spaCy. Should be run after upgrading spaCy via | #[code pip install -U spacy] to ensure that all installed models are | can be used with the new version. The command is also useful to detect | out-of-sync model links resulting from links created in different virtual | environments. Prints a list of models, the installed versions, the latest | compatible version (if out of date) and the commands for updating. +code(false, "bash", "$"). spacy validate +h(3, "convert") Convert p | Convert files into spaCy's #[+a("/api/annotation#json-input") JSON format] | for use with the #[code train] command and other experiment management | functions. The converter can be specified on the command line, or | chosen based on the file extension of the input file. +code(false, "bash", "$", false, false, true). spacy convert [input_file] [output_dir] [--converter] [--n-sents] [--morphology] +table(["Argument", "Type", "Description"]) +row +cell #[code input_file] +cell positional +cell Input file. +row +cell #[code output_dir] +cell positional +cell Output directory for converted JSON file. +row +cell #[code converter], #[code -c] +cell option +cell #[+tag-new(2)] Name of converter to use (see below). +row +cell #[code --n-sents], #[code -n] +cell option +cell Number of sentences per document. +row +cell #[code --morphology], #[code -m] +cell option +cell Enable appending morphology to tags. +row +cell #[code --help], #[code -h] +cell flag +cell Show help message and available arguments. p The following converters are available: +table(["ID", "Description"]) +row +cell #[code auto] +cell Automatically pick converter based on file extension (default). +row +cell #[code conllu], #[code conll] +cell Universal Dependencies #[code .conllu] or #[code .conll] format. +row +cell #[code ner] +cell Tab-based named entity recognition format. +row +cell #[code iob] +cell IOB named entity recognition format. +h(3, "train") Train p | Train a model. Expects data in spaCy's | #[+a("/api/annotation#json-input") JSON format]. On each epoch, a model | will be saved out to the directory. Accuracy scores and model details | will be added to a #[+a("/usage/training#models-generating") #[code meta.json]] | to allow packaging the model using the | #[+api("cli#package") #[code package]] command. +code(false, "bash", "$", false, false, true). spacy train [lang] [output_dir] [train_data] [dev_data] [--n-iter] [--n-sents] [--use-gpu] [--meta-path] [--vectors] [--no-tagger] [--no-parser] [--no-entities] [--gold-preproc] +table(["Argument", "Type", "Description"]) +row +cell #[code lang] +cell positional +cell Model language. +row +cell #[code output_dir] +cell positional +cell Directory to store model in. +row +cell #[code train_data] +cell positional +cell Location of JSON-formatted training data. +row +cell #[code dev_data] +cell positional +cell Location of JSON-formatted dev data (optional). +row +cell #[code --n-iter], #[code -n] +cell option +cell Number of iterations (default: #[code 20]). +row +cell #[code --n-sents], #[code -ns] +cell option +cell Number of sentences (default: #[code 0]). +row +cell #[code --use-gpu], #[code -g] +cell option +cell Use GPU. +row +cell #[code --vectors], #[code -v] +cell option +cell Model to load vectors from. +row +cell #[code --meta-path], #[code -m] +cell option +cell | #[+tag-new(2)] Optional path to model | #[+a("/usage/training#models-generating") #[code meta.json]]. | All relevant properties like #[code lang], #[code pipeline] and | #[code spacy_version] will be overwritten. +row +cell #[code --version], #[code -V] +cell option +cell | Model version. Will be written out to the model's | #[code meta.json] after training. +row +cell #[code --no-tagger], #[code -T] +cell flag +cell Don't train tagger. +row +cell #[code --no-parser], #[code -P] +cell flag +cell Don't train parser. +row +cell #[code --no-entities], #[code -N] +cell flag +cell Don't train NER. +row +cell #[code --gold-preproc], #[code -G] +cell flag +cell Use gold preprocessing. +row +cell #[code --help], #[code -h] +cell flag +cell Show help message and available arguments. +h(4, "train-hyperparams") Environment variables for hyperparameters +tag-new(2) p | spaCy lets you set hyperparameters for training via environment variables. | This is useful, because it keeps the command simple and allows you to | #[+a("https://askubuntu.com/questions/17536/how-do-i-create-a-permanent-bash-alias/17537#17537") create an alias] | for your custom #[code train] command while still being able to easily | tweak the hyperparameters. For example: +code(false, "bash"). parser_hidden_depth=2 parser_maxout_pieces=1 train-parser +table(["Name", "Description", "Default"]) +row +cell #[code dropout_from] +cell Initial dropout rate. +cell #[code 0.2] +row +cell #[code dropout_to] +cell Final dropout rate. +cell #[code 0.2] +row +cell #[code dropout_decay] +cell Rate of dropout change. +cell #[code 0.0] +row +cell #[code batch_from] +cell Initial batch size. +cell #[code 1] +row +cell #[code batch_to] +cell Final batch size. +cell #[code 64] +row +cell #[code batch_compound] +cell Rate of batch size acceleration. +cell #[code 1.001] +row +cell #[code token_vector_width] +cell Width of embedding tables and convolutional layers. +cell #[code 128] +row +cell #[code embed_size] +cell Number of rows in embedding tables. +cell #[code 7500] //- +row //- +cell #[code parser_maxout_pieces] //- +cell Number of pieces in the parser's and NER's first maxout layer. //- +cell #[code 2] //- +row //- +cell #[code parser_hidden_depth] //- +cell Number of hidden layers in the parser and NER. //- +cell #[code 1] +row +cell #[code hidden_width] +cell Size of the parser's and NER's hidden layers. +cell #[code 128] //- +row //- +cell #[code history_feats] //- +cell Number of previous action ID features for parser and NER. //- +cell #[code 128] //- +row //- +cell #[code history_width] //- +cell Number of embedding dimensions for each action ID. //- +cell #[code 128] +row +cell #[code learn_rate] +cell Learning rate. +cell #[code 0.001] +row +cell #[code optimizer_B1] +cell Momentum for the Adam solver. +cell #[code 0.9] +row +cell #[code optimizer_B2] +cell Adagrad-momentum for the Adam solver. +cell #[code 0.999] +row +cell #[code optimizer_eps] +cell Epsylon value for the Adam solver. +cell #[code 1e-08] +row +cell #[code L2_penalty] +cell L2 regularisation penalty. +cell #[code 1e-06] +row +cell #[code grad_norm_clip] +cell Gradient L2 norm constraint. +cell #[code 1.0] +h(3, "evaluate") Evaluate +tag-new(2) p | Evaluate a model's accuracy and speed on JSON-formatted annotated data. | Will print the results and optionally export | #[+a("/usage/visualizers") displaCy visualizations] of a sample set of | parses to #[code .html] files. Visualizations for the dependency parse | and NER will be exported as separate files if the respective component | is present in the model's pipeline. +code(false, "bash", "$", false, false, true). spacy evaluate [model] [data_path] [--displacy-path] [--displacy-limit] [--gpu-id] [--gold-preproc] +table(["Argument", "Type", "Description"]) +row +cell #[code model] +cell positional +cell | Model to evaluate. Can be a package or shortcut link name, or a | path to a model data directory. +row +cell #[code data_path] +cell positional +cell Location of JSON-formatted evaluation data. +row +cell #[code --displacy-path], #[code -dp] +cell option +cell | Directory to output rendered parses as HTML. If not set, no | visualizations will be generated. +row +cell #[code --displacy-limit], #[code -dl] +cell option +cell | Number of parses to generate per file. Defaults to #[code 25]. | Keep in mind that a significantly higher number might cause the | #[code .html] files to render slowly. +row +cell #[code --gpu-id], #[code -g] +cell option +cell GPU to use, if any. Defaults to #[code -1] for CPU. +row +cell #[code --gold-preproc], #[code -G] +cell flag +cell Use gold preprocessing. +h(3, "package") Package p | Generate a #[+a("/usage/training#models-generating") model Python package] | from an existing model data directory. All data files are copied over. | If the path to a meta.json is supplied, or a meta.json is found in the | input directory, this file is used. Otherwise, the data can be entered | directly from the command line. The required file templates are downloaded | from #[+src(gh("spacy-dev-resources", "templates/model")) GitHub] to make | sure you're always using the latest versions. This means you need to be | connected to the internet to use this command. +code(false, "bash", "$", false, false, true). spacy package [input_dir] [output_dir] [--meta-path] [--create-meta] [--force] +table(["Argument", "Type", "Description"]) +row +cell #[code input_dir] +cell positional +cell Path to directory containing model data. +row +cell #[code output_dir] +cell positional +cell Directory to create package folder in. +row +cell #[code --meta-path], #[code -m] +cell option +cell #[+tag-new(2)] Path to meta.json file (optional). +row +cell #[code --create-meta], #[code -c] +cell flag +cell | #[+tag-new(2)] Create a meta.json file on the command line, even | if one already exists in the directory. +row +cell #[code --force], #[code -f] +cell flag +cell Force overwriting of existing folder in output directory. +row +cell #[code --help], #[code -h] +cell flag +cell Show help message and available arguments.