Add note on requirements and preventing model re-downloads (closes #1143)

ines 2017-07-22 15:40:12 +02:00
parent de25bad036
commit 7c4bf9994d

@@ -198,6 +198,37 @@ p
nlp = en_core_web_md.load()
doc = nlp(u'This is a sentence.')
+h(3, "models-download") Downloading and requiring model dependencies
p
| spaCy's built-in #[+api("cli#download") #[code download]] command
| is mostly intended as a convenient, interactive wrapper. It performs
| compatibility checks and prints detailed error messages and warnings.
| However, if you're downloading models as part of an automated build
| process, this only adds an unnecessary layer of complexity. If you know
| which models your application needs, you should specify them directly.
+aside("Prevent re-downloading models")
| If you're installing a model from a URL, pip will usually re-download and
| re-install the package, even if you already have a matching
| version installed. To prevent this, simply add #[code #egg=] and the
| package name after the URL, e.g. #[code #egg=en_core_web_sm] or
| #[code #egg=en_core_web_sm-1.2.0]. This tells pip which package and version
| you're trying to download, so it can skip the package if a matching
| installation is found.
p
| Because all models are valid Python packages, you can add them to your
| application's #[code requirements.txt]. If you're running your own
| internal PyPI installation, you can simply upload the models there. pip's
| #[+a("https://pip.pypa.io/en/latest/reference/pip_install/#requirements-file-format") requirements file format]
| supports both package names to download from a PyPI server and direct
| URLs.
+code("requirements.txt", "text").
spacy>=1.8.0,<2.0.0
#{gh("spacy-models")}/releases/download/en_core_web_sm-1.2.0/en_core_web_sm-1.2.0.tar.gz#egg=en_core_web_sm-1.2.0
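p
| In an automated build, a direct-URL requirement like the one in
| #[code requirements.txt] could be generated rather than typed by hand.
| The following is only a minimal sketch: #[code model_requirement] and
| #[code MODELS_REPO] are hypothetical names, not part of spaCy, and the
| repository URL is an assumption about what #[code gh("spacy-models")]
| resolves to. The release path layout and the #[code #egg=] fragment are
| copied from the example above.

```python
# Hypothetical helper, not part of spaCy or pip.
# Assumption: this is the URL that gh("spacy-models") resolves to.
MODELS_REPO = "https://github.com/explosion/spacy-models"


def model_requirement(name, version, repo=MODELS_REPO):
    """Build a direct-URL requirement for a model release archive.

    The "#egg=<name>-<version>" fragment tells pip which package and
    version the URL provides.
    """
    package = "{}-{}".format(name, version)
    return "{repo}/releases/download/{pkg}/{pkg}.tar.gz#egg={pkg}".format(
        repo=repo, pkg=package
    )


if __name__ == "__main__":
    # Print a line that could be appended to requirements.txt.
    print(model_requirement("en_core_web_sm", "1.2.0"))
```

p
| Keeping the model name and version in one place means the archive URL
| and the #[code #egg=] fragment cannot drift apart when a model is upgraded.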
+h(2, "own-models") Using your own models
p