
include ../../_includes/_mixins
| As of v1.7.0, spaCy comes with new command line helpers to download and
| link models and show useful debugging information. For a list of available
| commands, type #[code python -m spacy --help].
+aside("Why python -m?")
| The problem with a global entry point is that it's resolved by looking up
| entries in your #[code PATH] environment variable. This can give you
| unexpected results, like executing the wrong spaCy installation
| (especially when using #[code virtualenv]). #[code python -m] prevents
| fallbacks to system modules and makes sure the correct spaCy version is
| used. If you hate typing it every time, we recommend creating an
| #[code alias] instead.
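| The same reasoning applies when calling the CLI from a script. The sketch
| below is only an illustration: #[code spacy_command] is a hypothetical
| helper, not part of spaCy. It builds the command around
| #[code sys.executable], so the interpreter that runs the script is also the
| one that runs spaCy, regardless of what #[code PATH] contains.

```python
import sys


def spacy_command(*args):
    """Build an argv list that runs spaCy's CLI with the current interpreter.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    # sys.executable is the path of the running Python binary, so the
    # command does not depend on what `python` or `spacy` resolve to on PATH.
    return [sys.executable, "-m", "spacy"] + list(args)


# Print the command instead of running it, so the sketch also works on a
# machine where spaCy is not installed.
print(" ".join(spacy_command("info", "--markdown")))
```

| The resulting list can be passed to #[code subprocess.call()] unchanged.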
+h(2, "download") Download
| Download #[+a("/docs/usage/models") models] for spaCy. The downloader finds the
| best-matching compatible version, uses pip to download the model as a
| package and automatically creates a
| #[+a("/docs/usage/models#usage") shortcut link] to load the model by name.
| Direct downloads don't perform any compatibility checks and require the
| model name to be specified with its version (e.g., #[code en_core_web_sm-1.2.0]).
+code(false, "bash").
python -m spacy download [model] [--direct]
+table(["Argument", "Type", "Description"])
+cell #[code model]
+cell positional
+cell Model name or shortcut (#[code en], #[code de], #[code vectors]).
+cell #[code --direct], #[code -d]
+cell flag
+cell Force direct download of exact model version.
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
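| To make the naming convention for direct downloads concrete, here is a
| minimal sketch that separates a versioned name such as
| #[code en_core_web_sm-1.2.0] from a plain shortcut such as #[code en].
| #[code split_versioned_name] is a hypothetical helper, and it assumes the
| version is a dotted-digit suffix after the last hyphen. It is not how the
| downloader itself parses names.

```python
def split_versioned_name(model):
    """Split 'name-version' into (name, version).

    Hypothetical helper: returns (model, None) when no version suffix is found.
    """
    name, sep, version = model.rpartition("-")
    # Assumption: a version consists only of digits separated by dots.
    if sep and version.replace(".", "").isdigit():
        return name, version
    return model, None


print(split_versioned_name("en_core_web_sm-1.2.0"))
print(split_versioned_name("en"))
```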
+infobox("Important note")
| The #[code download] command is mostly intended as a convenient,
| interactive wrapper: it performs compatibility checks and prints
| detailed messages in case things go wrong. It's #[strong not recommended]
| to use this command as part of an automated process. If you know which
| model your project needs, you should consider a
| #[+a("/docs/usage/models#download-pip") direct download via pip], or
| uploading the model to a local PyPI installation and fetching it straight
| from there. This will also allow you to add it as a versioned package
| dependency to your project.
+h(2, "link") Link
| Create a #[+a("/docs/usage/models#usage") shortcut link] for a model,
| either a Python package or a local directory. This will let you load
| models from any location using a custom name via
| #[+api("spacy#load") #[code spacy.load()]].
+code(false, "bash").
python -m spacy link [origin] [link_name] [--force]
+table(["Argument", "Type", "Description"])
+cell #[code origin]
+cell positional
+cell Model name if package, or path to local directory.
+cell #[code link_name]
+cell positional
+cell Name of the shortcut link to create.
+cell #[code --force], #[code -f]
+cell flag
+cell Force overwriting of existing link.
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
+h(2, "info") Info
| Print information about your spaCy installation, models and local setup,
| and generate #[+a("") Markdown]-formatted
| markup to copy-paste into #[+a(gh("spacy") + "/issues") GitHub issues].
+code(false, "bash").
python -m spacy info [--markdown]
python -m spacy info [model] [--markdown]
+table(["Argument", "Type", "Description"])
+cell #[code model]
+cell positional
+cell A model, i.e. shortcut link, package name or path (optional).
+cell #[code --markdown], #[code -md]
+cell flag
+cell Print information as Markdown.
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
+h(2, "convert") Convert
+tag experimental
| Convert files into spaCy's #[+a("/docs/api/annotation#json-input") JSON format]
| for use with the #[code train] command and other experiment management
| functions. The right converter is chosen based on the file extension of
| the input file. Currently, only #[code .conllu] files are supported.
+code(false, "bash").
python -m spacy convert [input_file] [output_dir] [--n-sents] [--morphology]
+table(["Argument", "Type", "Description"])
+cell #[code input_file]
+cell positional
+cell Input file.
+cell #[code output_dir]
+cell positional
+cell Output directory for converted JSON file.
+cell #[code --n-sents], #[code -n]
+cell option
+cell Number of sentences per document.
+cell #[code --morphology], #[code -m]
+cell option
+cell Enable appending morphology to tags.
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
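| Choosing a converter by file extension can be pictured as a lookup table.
| The sketch below is an assumption about the general pattern, not spaCy's
| actual code: #[code CONVERTERS], #[code pick_converter] and
| #[code convert_conllu] are hypothetical names, and the converter body is a
| stand-in that does no real conversion.

```python
import os


def convert_conllu(input_file, output_dir):
    # Stand-in only: a real converter would parse the file and write JSON.
    return "would convert %s into %s" % (input_file, output_dir)


# Hypothetical registry mapping file extensions to converter functions.
CONVERTERS = {".conllu": convert_conllu}


def pick_converter(input_file):
    """Return the converter registered for the input file's extension."""
    extension = os.path.splitext(input_file)[1]
    if extension not in CONVERTERS:
        raise ValueError("No converter for file type: %r" % extension)
    return CONVERTERS[extension]


converter = pick_converter("train.conllu")
print(converter("train.conllu", "converted"))
```

| With this layout, supporting another format would only mean registering one
| more entry in the table.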
+h(2, "model") Model
+tag experimental
| Initialise a new model and its data directory. For more info on this, see
| the documentation on #[+a("/docs/usage/adding-languages") adding languages].
+code(false, "bash").
python -m spacy model [lang] [model_dir] [freqs_data] [clusters_data] [vectors_data]
+table(["Argument", "Type", "Description"])
+cell #[code lang]
+cell positional
+cell Model language.
+cell #[code model_dir]
+cell positional
+cell Output directory to store the model in.
+cell #[code freqs_data]
+cell positional
+cell Tab-separated frequencies file.
+cell #[code clusters_data]
+cell positional
+cell Brown clusters file (optional).
+cell #[code vectors_data]
+cell positional
+cell Word vectors file (optional).
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
+h(2, "train") Train
+tag experimental
| Train a model. Expects data in spaCy's
| #[+a("/docs/api/annotation#json-input") JSON format].
+code(false, "bash").
python -m spacy train [lang] [output_dir] [train_data] [dev_data] [--n-iter] [--nsents] [--parser-L1] [--use-gpu] [--no-tagger] [--no-parser] [--no-ner]
+table(["Argument", "Type", "Description"])
+cell #[code lang]
+cell positional
+cell Model language.
+cell #[code output_dir]
+cell positional
+cell Directory to store model in.
+cell #[code train_data]
+cell positional
+cell Location of JSON-formatted training data.
+cell #[code dev_data]
+cell positional
+cell Location of JSON-formatted dev data (optional).
+cell #[code --n-iter], #[code -n]
+cell option
+cell Number of iterations (default: #[code 15]).
+cell #[code --nsents]
+cell option
+cell Number of sentences (default: #[code 0]).
+cell #[code --parser-L1], #[code -L]
+cell option
+cell L1 regularization penalty for parser (default: #[code 0.0]).
+cell #[code --use-gpu], #[code -g]
+cell flag
+cell Use GPU.
+cell #[code --no-tagger], #[code -T]
+cell flag
+cell Don't train tagger.
+cell #[code --no-parser], #[code -P]
+cell flag
+cell Don't train parser.
+cell #[code --no-ner], #[code -N]
+cell flag
+cell Don't train NER.
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
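| The three values in the "Type" column behave differently on the command
| line. As a rough illustration, the sketch below mirrors the documented
| #[code train] arguments with Python's #[code argparse]. This is an
| assumption made for clarity, not spaCy's own argument parser, and
| #[code build_train_parser] is a hypothetical name. Only the argument names,
| short forms and defaults are taken from the table above.

```python
import argparse


def build_train_parser():
    """Mirror the documented `train` arguments with argparse (illustration only)."""
    parser = argparse.ArgumentParser(prog="python -m spacy train")
    # Positionals are identified by their order on the command line.
    parser.add_argument("lang")
    parser.add_argument("output_dir")
    parser.add_argument("train_data")
    parser.add_argument("dev_data", nargs="?")  # optional positional
    # Options take a value and fall back to the documented default.
    parser.add_argument("-n", "--n-iter", type=int, default=15)
    parser.add_argument("--nsents", type=int, default=0)
    parser.add_argument("-L", "--parser-L1", type=float, default=0.0)
    # Flags take no value; they are False unless present.
    parser.add_argument("-g", "--use-gpu", action="store_true")
    parser.add_argument("-T", "--no-tagger", action="store_true")
    parser.add_argument("-P", "--no-parser", action="store_true")
    parser.add_argument("-N", "--no-ner", action="store_true")
    return parser


args = build_train_parser().parse_args(
    ["en", "models/en", "train.json", "--n-iter", "5", "--no-ner"]
)
print(args.n_iter, args.no_ner, args.dev_data)
```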
+h(2, "package") Package
+tag experimental
| Generate a #[+a("/docs/usage/saving-loading#generating") model Python package]
| from an existing model data directory. All data files are copied over.
| If the path to a meta.json is supplied, or a meta.json is found in the
| input directory, this file is used. Otherwise, the data can be entered
| directly from the command line. While this feature is still experimental,
| the required file templates are downloaded from
| #[+src(gh("spacy-dev-resources", "templates/model")) GitHub]. This means
| you need to be connected to the internet to use this command.
+code(false, "bash").
python -m spacy package [input_dir] [output_dir] [--meta] [--force]
+table(["Argument", "Type", "Description"])
+cell #[code input_dir]
+cell positional
+cell Path to directory containing model data.
+cell #[code output_dir]
+cell positional
+cell Directory to create package folder in.
+cell #[code --meta]
+cell option
+cell Path to meta.json file (optional).
+cell #[code --force], #[code -f]
+cell flag
+cell Force overwriting of existing folder in output directory.
+cell #[code --help], #[code -h]
+cell flag
+cell Show help message and available arguments.
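| The lookup order for the meta.json described above can be sketched as
| follows. #[code resolve_meta] is a hypothetical helper, not spaCy's
| implementation. It assumes that an explicitly supplied path takes
| precedence over a meta.json in the input directory, which the description
| above does not state, and it returns #[code None] for the case where the
| data would have to be entered on the command line.

```python
import json
import os


def resolve_meta(input_dir, meta_path=None):
    """Return the parsed meta.json, or None if no file could be found.

    Hypothetical helper. Assumption: an explicit path wins over the
    meta.json inside the input directory.
    """
    candidates = [meta_path, os.path.join(input_dir, "meta.json")]
    for candidate in candidates:
        # Skip a missing --meta argument and paths that do not exist.
        if candidate and os.path.isfile(candidate):
            with open(candidate, encoding="utf8") as file_:
                return json.load(file_)
    # No file found: the caller would prompt for the values instead.
    return None


print(resolve_meta("path/that/does/not/exist"))
```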