mirror of
https://github.com/explosion/spaCy.git
synced 2024-11-10 19:57:17 +03:00
💫 Use README.md instead of README.rst (#2968)
* Auto-format setup.py
* Use README.md instead of README.rst
parent
41c6002fd8
commit
3832c8a2c1
|
@ -1,4 +1,4 @@
|
|||
recursive-include include *.h
|
||||
include LICENSE
|
||||
include README.rst
|
||||
include README.md
|
||||
include bin/spacy
|
||||
|
|
278
README.md
Normal file
|
@ -0,0 +1,278 @@
|
|||
# spaCy: Industrial-strength NLP
|
||||
|
||||
spaCy is a library for advanced Natural Language Processing in Python and
|
||||
Cython. It's built on the very latest research, and was designed from day one
|
||||
to be used in real products. spaCy comes with
|
||||
[pre-trained statistical models](https://spacy.io/models) and word vectors, and
|
||||
currently supports tokenization for **30+ languages**. It features the
|
||||
**fastest syntactic parser** in the world, convolutional
|
||||
**neural network models** for tagging, parsing and **named entity recognition**
|
||||
and easy **deep learning** integration. It's commercial open-source software,
|
||||
released under the MIT license.
|
||||
|
||||
💫 **Version 2.1 out now!** [Check out the release notes here.](https://github.com/explosion/spaCy/releases)
|
||||
|
||||
[![Travis Build Status](https://img.shields.io/travis/explosion/spaCy/master.svg?style=flat-square&logo=travis)](https://travis-ci.org/explosion/spaCy)
|
||||
[![Appveyor Build Status](https://img.shields.io/appveyor/ci/explosion/spaCy/master.svg?style=flat-square&logo=appveyor)](https://ci.appveyor.com/project/explosion/spaCy)
|
||||
[![Current Release Version](https://img.shields.io/github/release/explosion/spacy.svg?style=flat-square)](https://github.com/explosion/spaCy/releases)
|
||||
[![pypi Version](https://img.shields.io/pypi/v/spacy.svg?style=flat-square)](https://pypi.python.org/pypi/spacy)
|
||||
[![conda Version](https://img.shields.io/conda/vn/conda-forge/spacy.svg?style=flat-square)](https://anaconda.org/conda-forge/spacy)
|
||||
[![Python wheels](https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white)](https://github.com/explosion/wheelwright/releases)
|
||||
[![spaCy on Twitter](https://img.shields.io/twitter/follow/spacy_io.svg?style=social&label=Follow)](https://twitter.com/spacy_io)
|
||||
|
||||
## 📖 Documentation
|
||||
|
||||
| Documentation | |
|
||||
| --- | --- |
|
||||
| [spaCy 101] | New to spaCy? Here's everything you need to know! |
|
||||
| [Usage Guides] | How to use spaCy and its features. |
|
||||
| [New in v2.0] | New features, backwards incompatibilities and migration guide. |
|
||||
| [API Reference] | The detailed reference for spaCy's API. |
|
||||
| [Models] | Download statistical language models for spaCy. |
|
||||
| [Universe] | Libraries, extensions, demos, books and courses. |
|
||||
| [Changelog] | Changes and version history. |
|
||||
| [Contribute] | How to contribute to the spaCy project and code base. |
|
||||
|
||||
[spaCy 101]: https://spacy.io/usage/spacy-101
|
||||
[New in v2.0]: https://spacy.io/usage/v2#migrating
|
||||
[Usage Guides]: https://spacy.io/usage/
|
||||
[API Reference]: https://spacy.io/api/
|
||||
[Models]: https://spacy.io/models
|
||||
[Universe]: https://spacy.io/universe
|
||||
[Changelog]: https://spacy.io/usage/#changelog
|
||||
[Contribute]: https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md
|
||||
|
||||
## 💬 Where to ask questions
|
||||
|
||||
The spaCy project is maintained by [@honnibal](https://github.com/honnibal)
|
||||
and [@ines](https://github.com/ines). Please understand that we won't be able
|
||||
to provide individual support via email. We also believe that help is much more
|
||||
valuable if it's shared publicly, so that more people can benefit from it.
|
||||
|
||||
* **Bug Reports**: [GitHub Issue Tracker]
|
||||
* **Usage Questions**: [Stack Overflow] · [Gitter Chat] · [Reddit User Group]
|
||||
* **General Discussion**: [Gitter Chat] · [Reddit User Group]
|
||||
|
||||
[GitHub Issue Tracker]: https://github.com/explosion/spaCy/issues
|
||||
[Stack Overflow]: http://stackoverflow.com/questions/tagged/spacy
|
||||
[Gitter Chat]: https://gitter.im/explosion/spaCy
|
||||
[Reddit User Group]: https://www.reddit.com/r/spacynlp
|
||||
|
||||
## Features
|
||||
|
||||
* **Fastest syntactic parser** in the world
|
||||
* **Named entity** recognition
|
||||
* Non-destructive **tokenization**
|
||||
* Support for **30+ languages**
|
||||
* Pre-trained [statistical models](https://spacy.io/models) and word vectors
|
||||
* Easy **deep learning** integration
|
||||
* Part-of-speech tagging
|
||||
* Labelled dependency parsing
|
||||
* Syntax-driven sentence segmentation
|
||||
* Built-in **visualizers** for syntax and NER
|
||||
* Convenient string-to-hash mapping
|
||||
* Export to numpy data arrays
|
||||
* Efficient binary serialization
|
||||
* Easy **model packaging** and deployment
|
||||
* State-of-the-art speed
|
||||
* Robust, rigorously evaluated accuracy
|
||||
|
||||
📖 **For more details, see the
|
||||
[facts, figures and benchmarks](https://spacy.io/usage/facts-figures).**
|
||||
|
||||
## Install spaCy
|
||||
|
||||
For detailed installation instructions, see the
|
||||
[documentation](https://spacy.io/usage).
|
||||
|
||||
* **Operating system**: macOS / OS X · Linux · Windows (Cygwin, MinGW, Visual Studio)
|
||||
* **Python version**: Python 2.7, 3.4+ (only 64 bit)
|
||||
* **Package managers**: [pip] · [conda] (via `conda-forge`)
|
||||
|
||||
[pip]: https://pypi.python.org/pypi/spacy
|
||||
[conda]: https://anaconda.org/conda-forge/spacy
|
||||
|
||||
### pip
|
||||
|
||||
Using pip, spaCy releases are available as source packages and binary wheels
|
||||
(as of `v2.0.13`).
|
||||
|
||||
```bash
|
||||
pip install spacy
|
||||
```
|
||||
|
||||
When using pip it is generally recommended to install packages in a virtual
|
||||
environment to avoid modifying system state:
|
||||
|
||||
```bash
|
||||
python -m venv .env
|
||||
source .env/bin/activate
|
||||
pip install spacy
|
||||
```
|
||||
|
||||
### conda
|
||||
|
||||
Thanks to our great community, we've finally re-added conda support. You can now
|
||||
install spaCy via `conda-forge`:
|
||||
|
||||
```bash
|
||||
conda config --add channels conda-forge
|
||||
conda install spacy
|
||||
```
|
||||
|
||||
For the feedstock including the build recipe and configuration,
|
||||
check out [this repository](https://github.com/conda-forge/spacy-feedstock).
|
||||
Improvements and pull requests to the recipe and setup are always appreciated.
|
||||
|
||||
### Updating spaCy
|
||||
|
||||
Some updates to spaCy may require downloading new statistical models. If you're
|
||||
running spaCy v2.0 or higher, you can use the `validate` command to check if
|
||||
your installed models are compatible and if not, print details on how to update
|
||||
them:
|
||||
|
||||
```bash
|
||||
pip install -U spacy
|
||||
python -m spacy validate
|
||||
```
|
||||
|
||||
If you've trained your own models, keep in mind that your training and runtime
|
||||
inputs must match. After updating spaCy, we recommend **retraining your models**
|
||||
with the new version.
|
||||
|
||||
📖 **For details on upgrading from spaCy 1.x to spaCy 2.x, see the
|
||||
[migration guide](https://spacy.io/usage/v2#migrating).**
|
||||
|
||||
## Download models
|
||||
|
||||
As of v1.7.0, models for spaCy can be installed as **Python packages**.
|
||||
This means that they're a component of your application, just like any
|
||||
other module. Models can be installed using spaCy's `download` command,
|
||||
or manually by pointing pip to a path or URL.
|
||||
|
||||
| Documentation | |
|
||||
| --- | --- |
|
||||
| [Available Models] | Detailed model descriptions, accuracy figures and benchmarks. |
|
||||
| [Models Documentation] | Detailed usage instructions. |
|
||||
|
||||
[Available Models]: https://spacy.io/models
|
||||
[Models Documentation]: https://spacy.io/docs/usage/models
|
||||
|
||||
```bash
|
||||
# out-of-the-box: download best-matching default model
|
||||
python -m spacy download en
|
||||
|
||||
# download best-matching version of specific model for your spaCy installation
|
||||
python -m spacy download en_core_web_lg
|
||||
|
||||
# pip install .tar.gz archive from path or URL
|
||||
pip install /Users/you/en_core_web_sm-2.0.0.tar.gz
|
||||
```
|
||||
|
||||
### Loading and using models
|
||||
|
||||
To load a model, use `spacy.load()` with the model's shortcut link:
|
||||
|
||||
```python
|
||||
import spacy
|
||||
nlp = spacy.load('en')
|
||||
doc = nlp(u'This is a sentence.')
|
||||
```
|
||||
|
||||
If you've installed a model via pip, you can also `import` it directly and
|
||||
then call its `load()` method:
|
||||
|
||||
```python
|
||||
import spacy
|
||||
import en_core_web_sm
|
||||
|
||||
nlp = en_core_web_sm.load()
|
||||
doc = nlp(u'This is a sentence.')
|
||||
```
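Because pip-installed models are ordinary Python packages, one way to check for a model before loading it is to ask the import system whether the package can be found. The following is a minimal sketch using only the standard library (it needs Python 3.4+). The helper name `is_package_installed` is made up for illustration and is not part of spaCy's API.

```python
import importlib.util


def is_package_installed(name):
    """Return True if a top-level package called `name` can be imported."""
    # find_spec returns None when no importable module with this name exists.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "en_core_web_sm" is the model package used in the example above.
    if is_package_installed("en_core_web_sm"):
        print("Model package found; en_core_web_sm.load() should be available.")
    else:
        print("Model package missing; try: python -m spacy download en_core_web_sm")
```

The check only confirms that the package is importable. It says nothing about whether the model version matches the installed spaCy version, which is what `python -m spacy validate` is for.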
|
||||
|
||||
📖 **For more info and examples, check out the
|
||||
[models documentation](https://spacy.io/docs/usage/models).**
|
||||
|
||||
### Support for older versions
|
||||
|
||||
If you're using an older version (`v1.6.0` or below), you can still download
|
||||
and install the old models from within spaCy using `python -m spacy.en.download all`
|
||||
or `python -m spacy.de.download all`. The `.tar.gz` archives are also
|
||||
[attached to the v1.6.0 release](https://github.com/explosion/spaCy/tree/v1.6.0).
|
||||
To download and install the models manually, unpack the archive, drop the
|
||||
contained directory into `spacy/data` and load the model via `spacy.load('en')`
|
||||
or `spacy.load('de')`.
|
||||
|
||||
## Compile from source
|
||||
|
||||
The other way to install spaCy is to clone its
|
||||
[GitHub repository](https://github.com/explosion/spaCy) and build it from
|
||||
source. That is the common way if you want to make changes to the code base.
|
||||
You'll need to make sure that you have a development environment consisting of a
|
||||
Python distribution including header files, a compiler,
|
||||
[pip](https://pip.pypa.io/en/latest/installing/),
|
||||
[virtualenv](https://virtualenv.pypa.io/) and [git](https://git-scm.com)
|
||||
installed. The compiler part is the trickiest. How to set one up depends on your
system. See the notes on Ubuntu, macOS / OS X and Windows for details.
|
||||
|
||||
```bash
|
||||
# make sure you are using the latest pip
|
||||
python -m pip install -U pip
|
||||
git clone https://github.com/explosion/spaCy
|
||||
cd spaCy
|
||||
|
||||
python -m venv .env
|
||||
source .env/bin/activate
|
||||
export PYTHONPATH=`pwd`
|
||||
pip install -r requirements.txt
|
||||
python setup.py build_ext --inplace
|
||||
```
|
||||
|
||||
Compared to a regular install via pip, [requirements.txt](requirements.txt)
|
||||
additionally installs developer dependencies such as Cython. For more details
|
||||
and instructions, see the documentation on
|
||||
[compiling spaCy from source](https://spacy.io/usage/#source) and the
|
||||
[quickstart widget](https://spacy.io/usage/#section-quickstart) to get
|
||||
the right commands for your platform and Python version.
|
||||
|
||||
### Ubuntu
|
||||
|
||||
Install system-level dependencies via `apt-get`:
|
||||
|
||||
```bash
|
||||
sudo apt-get install build-essential python-dev git
|
||||
```
|
||||
|
||||
### macOS / OS X
|
||||
|
||||
Install a recent version of [Xcode](https://developer.apple.com/xcode/),
|
||||
including the so-called "Command Line Tools". macOS and OS X ship with Python
|
||||
and git preinstalled.
|
||||
|
||||
### Windows
|
||||
|
||||
Install a version of the [Visual C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) or
|
||||
[Visual Studio Express](https://www.visualstudio.com/vs/visual-studio-express/)
|
||||
that matches the version that was used to compile your Python
|
||||
interpreter. For official distributions these are VS 2008 (Python 2.7),
|
||||
VS 2010 (Python 3.4) and VS 2015 (Python 3.5).
|
||||
|
||||
## Run tests
|
||||
|
||||
spaCy comes with an [extensive test suite](spacy/tests). In order to run the
|
||||
tests, you'll usually want to clone the repository and build spaCy from source.
|
||||
This will also install the required development dependencies and test utilities
|
||||
defined in the `requirements.txt`.
|
||||
|
||||
Alternatively, you can find out where spaCy is installed and run `pytest` on
|
||||
that directory. Don't forget to also install the test utilities via spaCy's
|
||||
`requirements.txt`:
|
||||
|
||||
```bash
|
||||
python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
|
||||
pip install -r path/to/requirements.txt
|
||||
python -m pytest <spacy-directory>
|
||||
```
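The first command above imports spaCy just to print its location. A small sketch of the same lookup as a reusable function, using only the standard library, could look like this. The function name `package_dir` is hypothetical, and the demo uses the stdlib package `json` so that it runs even where spaCy is not installed.

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory a package is installed in, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py (or the module file).
    return os.path.dirname(spec.origin)


if __name__ == "__main__":
    # Swap "json" for "spacy" to get the directory to pass to pytest.
    print(package_dir("json"))
```

Unlike the one-liner, this does not execute the package's `__init__.py`, which can be handy when the import itself is what is failing.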
|
||||
|
||||
See [the documentation](https://spacy.io/usage/#tests) for more details and
|
||||
examples.
|
332
README.rst
|
@ -1,332 +0,0 @@
|
|||
spaCy: Industrial-strength NLP
|
||||
******************************
|
||||
|
||||
spaCy is a library for advanced Natural Language Processing in Python and Cython.
|
||||
It's built on the very latest research, and was designed from day one to be
|
||||
used in real products. spaCy comes with
|
||||
`pre-trained statistical models <https://spacy.io/models>`_ and word
|
||||
vectors, and currently supports tokenization for **20+ languages**. It features
|
||||
the **fastest syntactic parser** in the world, convolutional **neural network models**
|
||||
for tagging, parsing and **named entity recognition** and easy **deep learning**
|
||||
integration. It's commercial open-source software, released under the MIT license.
|
||||
|
||||
💫 **Version 2.0 out now!** `Check out the new features here. <https://spacy.io/usage/v2>`_
|
||||
|
||||
.. image:: https://img.shields.io/travis/explosion/spaCy/master.svg?style=flat-square&logo=travis
|
||||
:target: https://travis-ci.org/explosion/spaCy
|
||||
:alt: Build Status
|
||||
|
||||
.. image:: https://img.shields.io/appveyor/ci/explosion/spaCy/master.svg?style=flat-square&logo=appveyor
|
||||
:target: https://ci.appveyor.com/project/explosion/spaCy
|
||||
:alt: Appveyor Build Status
|
||||
|
||||
.. image:: https://img.shields.io/github/release/explosion/spacy.svg?style=flat-square
|
||||
:target: https://github.com/explosion/spaCy/releases
|
||||
:alt: Current Release Version
|
||||
|
||||
.. image:: https://img.shields.io/pypi/v/spacy.svg?style=flat-square
|
||||
:target: https://pypi.python.org/pypi/spacy
|
||||
:alt: pypi Version
|
||||
|
||||
.. image:: https://img.shields.io/conda/vn/conda-forge/spacy.svg?style=flat-square
|
||||
:target: https://anaconda.org/conda-forge/spacy
|
||||
:alt: conda Version
|
||||
|
||||
.. image:: https://img.shields.io/badge/chat-join%20%E2%86%92-09a3d5.svg?style=flat-square&logo=gitter-white
|
||||
:target: https://gitter.im/explosion/spaCy
|
||||
:alt: spaCy on Gitter
|
||||
|
||||
.. image:: https://img.shields.io/twitter/follow/spacy_io.svg?style=social&label=Follow
|
||||
:target: https://twitter.com/spacy_io
|
||||
:alt: spaCy on Twitter
|
||||
|
||||
📖 Documentation
|
||||
================
|
||||
|
||||
=================== ===
|
||||
`spaCy 101`_ New to spaCy? Here's everything you need to know!
|
||||
`Usage Guides`_ How to use spaCy and its features.
|
||||
`New in v2.0`_ New features, backwards incompatibilities and migration guide.
|
||||
`API Reference`_ The detailed reference for spaCy's API.
|
||||
`Models`_ Download statistical language models for spaCy.
|
||||
`Universe`_ Libraries, extensions, demos, books and courses.
|
||||
`Changelog`_ Changes and version history.
|
||||
`Contribute`_ How to contribute to the spaCy project and code base.
|
||||
=================== ===
|
||||
|
||||
.. _spaCy 101: https://spacy.io/usage/spacy-101
|
||||
.. _New in v2.0: https://spacy.io/usage/v2#migrating
|
||||
.. _Usage Guides: https://spacy.io/usage/
|
||||
.. _API Reference: https://spacy.io/api/
|
||||
.. _Models: https://spacy.io/models
|
||||
.. _Universe: https://spacy.io/universe
|
||||
.. _Changelog: https://spacy.io/usage/#changelog
|
||||
.. _Contribute: https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md
|
||||
|
||||
💬 Where to ask questions
|
||||
==========================
|
||||
|
||||
The spaCy project is maintained by `@honnibal <https://github.com/honnibal>`_
|
||||
and `@ines <https://github.com/ines>`_. Please understand that we won't be able
|
||||
to provide individual support via email. We also believe that help is much more
|
||||
valuable if it's shared publicly, so that more people can benefit from it.
|
||||
|
||||
====================== ===
|
||||
**Bug Reports** `GitHub Issue Tracker`_
|
||||
**Usage Questions** `StackOverflow`_, `Gitter Chat`_, `Reddit User Group`_
|
||||
**General Discussion** `Gitter Chat`_, `Reddit User Group`_
|
||||
====================== ===
|
||||
|
||||
.. _GitHub Issue Tracker: https://github.com/explosion/spaCy/issues
|
||||
.. _StackOverflow: http://stackoverflow.com/questions/tagged/spacy
|
||||
.. _Gitter Chat: https://gitter.im/explosion/spaCy
|
||||
.. _Reddit User Group: https://www.reddit.com/r/spacynlp
|
||||
|
||||
Features
|
||||
========
|
||||
|
||||
* **Fastest syntactic parser** in the world
|
||||
* **Named entity** recognition
|
||||
* Non-destructive **tokenization**
|
||||
* Support for **20+ languages**
|
||||
* Pre-trained `statistical models <https://spacy.io/models>`_ and word vectors
|
||||
* Easy **deep learning** integration
|
||||
* Part-of-speech tagging
|
||||
* Labelled dependency parsing
|
||||
* Syntax-driven sentence segmentation
|
||||
* Built-in **visualizers** for syntax and NER
|
||||
* Convenient string-to-hash mapping
|
||||
* Export to numpy data arrays
|
||||
* Efficient binary serialization
|
||||
* Easy **model packaging** and deployment
|
||||
* State-of-the-art speed
|
||||
* Robust, rigorously evaluated accuracy
|
||||
|
||||
📖 **For more details, see the** `facts, figures and benchmarks <https://spacy.io/usage/facts-figures>`_.
|
||||
|
||||
Install spaCy
|
||||
=============
|
||||
|
||||
For detailed installation instructions, see
|
||||
the `documentation <https://spacy.io/usage>`_.
|
||||
|
||||
==================== ===
|
||||
**Operating system** macOS / OS X, Linux, Windows (Cygwin, MinGW, Visual Studio)
|
||||
**Python version** CPython 2.7, 3.4+. Only 64 bit.
|
||||
**Package managers** `pip`_ (source packages only), `conda`_ (via ``conda-forge``)
|
||||
==================== ===
|
||||
|
||||
.. _pip: https://pypi.python.org/pypi/spacy
|
||||
.. _conda: https://anaconda.org/conda-forge/spacy
|
||||
|
||||
pip
|
||||
---
|
||||
|
||||
Using pip, spaCy releases are currently only available as source packages.
|
||||
|
||||
.. code:: bash
|
||||
|
||||
pip install spacy
|
||||
|
||||
When using pip it is generally recommended to install packages in a virtual
|
||||
environment to avoid modifying system state:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
python -m venv .env
|
||||
source .env/bin/activate
|
||||
pip install spacy
|
||||
|
||||
conda
|
||||
-----
|
||||
|
||||
Thanks to our great community, we've finally re-added conda support. You can now
|
||||
install spaCy via ``conda-forge``:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
conda config --add channels conda-forge
|
||||
conda install spacy
|
||||
|
||||
For the feedstock including the build recipe and configuration,
|
||||
check out `this repository <https://github.com/conda-forge/spacy-feedstock>`_.
|
||||
Improvements and pull requests to the recipe and setup are always appreciated.
|
||||
|
||||
Updating spaCy
|
||||
--------------
|
||||
|
||||
Some updates to spaCy may require downloading new statistical models. If you're
|
||||
running spaCy v2.0 or higher, you can use the ``validate`` command to check if
|
||||
your installed models are compatible and if not, print details on how to update
|
||||
them:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
pip install -U spacy
|
||||
python -m spacy validate
|
||||
|
||||
If you've trained your own models, keep in mind that your training and runtime
|
||||
inputs must match. After updating spaCy, we recommend **retraining your models**
|
||||
with the new version.
|
||||
|
||||
📖 **For details on upgrading from spaCy 1.x to spaCy 2.x, see the**
|
||||
`migration guide <https://spacy.io/usage/v2#migrating>`_.
|
||||
|
||||
Download models
|
||||
===============
|
||||
|
||||
As of v1.7.0, models for spaCy can be installed as **Python packages**.
|
||||
This means that they're a component of your application, just like any
|
||||
other module. Models can be installed using spaCy's ``download`` command,
|
||||
or manually by pointing pip to a path or URL.
|
||||
|
||||
======================= ===
|
||||
`Available Models`_ Detailed model descriptions, accuracy figures and benchmarks.
|
||||
`Models Documentation`_ Detailed usage instructions.
|
||||
======================= ===
|
||||
|
||||
.. _Available Models: https://spacy.io/models
|
||||
.. _Models Documentation: https://spacy.io/docs/usage/models
|
||||
|
||||
.. code:: bash
|
||||
|
||||
# out-of-the-box: download best-matching default model
|
||||
python -m spacy download en
|
||||
|
||||
# download best-matching version of specific model for your spaCy installation
|
||||
python -m spacy download en_core_web_lg
|
||||
|
||||
# pip install .tar.gz archive from path or URL
|
||||
pip install /Users/you/en_core_web_sm-2.0.0.tar.gz
|
||||
|
||||
If you have SSL certification problems, SSL customization options are described in the help:
|
||||
|
||||
# help for the download command
|
||||
python -m spacy download --help
|
||||
|
||||
Loading and using models
|
||||
------------------------
|
||||
|
||||
To load a model, use ``spacy.load()`` with the model's shortcut link:
|
||||
|
||||
.. code:: python
|
||||
|
||||
import spacy
|
||||
nlp = spacy.load('en')
|
||||
doc = nlp(u'This is a sentence.')
|
||||
|
||||
If you've installed a model via pip, you can also ``import`` it directly and
|
||||
then call its ``load()`` method:
|
||||
|
||||
.. code:: python
|
||||
|
||||
import spacy
|
||||
import en_core_web_sm
|
||||
|
||||
nlp = en_core_web_sm.load()
|
||||
doc = nlp(u'This is a sentence.')
|
||||
|
||||
📖 **For more info and examples, check out the**
|
||||
`models documentation <https://spacy.io/docs/usage/models>`_.
|
||||
|
||||
Support for older versions
|
||||
--------------------------
|
||||
|
||||
If you're using an older version (``v1.6.0`` or below), you can still download
|
||||
and install the old models from within spaCy using ``python -m spacy.en.download all``
|
||||
or ``python -m spacy.de.download all``. The ``.tar.gz`` archives are also
|
||||
`attached to the v1.6.0 release <https://github.com/explosion/spaCy/tree/v1.6.0>`_.
|
||||
To download and install the models manually, unpack the archive, drop the
|
||||
contained directory into ``spacy/data`` and load the model via ``spacy.load('en')``
|
||||
or ``spacy.load('de')``.
|
||||
|
||||
Compile from source
|
||||
===================
|
||||
|
||||
The other way to install spaCy is to clone its
|
||||
`GitHub repository <https://github.com/explosion/spaCy>`_ and build it from
|
||||
source. That is the common way if you want to make changes to the code base.
|
||||
You'll need to make sure that you have a development environment consisting of a
|
||||
Python distribution including header files, a compiler,
|
||||
`pip <https://pip.pypa.io/en/latest/installing/>`__, `virtualenv <https://virtualenv.pypa.io/>`_
|
||||
and `git <https://git-scm.com>`_ installed. The compiler part is the trickiest.
|
||||
How to do that depends on your system. See notes on Ubuntu, OS X and Windows for
|
||||
details.
|
||||
|
||||
.. code:: bash
|
||||
|
||||
# make sure you are using the latest pip
|
||||
python -m pip install -U pip
|
||||
git clone https://github.com/explosion/spaCy
|
||||
cd spaCy
|
||||
|
||||
python -m venv .env
|
||||
source .env/bin/activate
|
||||
export PYTHONPATH=`pwd`
|
||||
pip install -r requirements.txt
|
||||
python setup.py build_ext --inplace
|
||||
|
||||
Compared to regular install via pip, `requirements.txt <requirements.txt>`_
|
||||
additionally installs developer dependencies such as Cython. For more details
|
||||
and instructions, see the documentation on
|
||||
`compiling spaCy from source <https://spacy.io/usage/#source>`_ and the
|
||||
`quickstart widget <https://spacy.io/usage/#section-quickstart>`_ to get
|
||||
the right commands for your platform and Python version.
|
||||
|
||||
Instead of the above verbose commands, you can also use the following
|
||||
`Fabric <http://www.fabfile.org/>`_ commands. All commands assume that your
|
||||
virtual environment is located in a directory ``.env``. If you're using a
|
||||
different directory, you can change it via the environment variable ``VENV_DIR``,
|
||||
for example ``VENV_DIR=".custom-env" fab clean make``.
|
||||
|
||||
============= ===
|
||||
``fab env`` Create virtual environment and delete previous one, if it exists.
|
||||
``fab make`` Compile the source.
|
||||
``fab clean`` Remove compiled objects, including the generated C++.
|
||||
``fab test`` Run basic tests, aborting after first failure.
|
||||
============= ===
|
||||
|
||||
Ubuntu
|
||||
------
|
||||
|
||||
Install system-level dependencies via ``apt-get``:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
sudo apt-get install build-essential python-dev git
|
||||
|
||||
macOS / OS X
|
||||
------------
|
||||
|
||||
Install a recent version of `Xcode <https://developer.apple.com/xcode/>`_,
|
||||
including the so-called "Command Line Tools". macOS and OS X ship with Python
|
||||
and git preinstalled.
|
||||
|
||||
Windows
|
||||
-------
|
||||
|
||||
Install a version of `Visual Studio Express <https://www.visualstudio.com/vs/visual-studio-express/>`_
|
||||
or higher that matches the version that was used to compile your Python
|
||||
interpreter. For official distributions these are VS 2008 (Python 2.7),
|
||||
VS 2010 (Python 3.4) and VS 2015 (Python 3.5).
|
||||
|
||||
Run tests
|
||||
=========
|
||||
|
||||
spaCy comes with an `extensive test suite <spacy/tests>`_. In order to run the
|
||||
tests, you'll usually want to clone the repository and build spaCy from source.
|
||||
This will also install the required development dependencies and test utilities
|
||||
defined in the ``requirements.txt``.
|
||||
|
||||
Alternatively, you can find out where spaCy is installed and run ``pytest`` on
|
||||
that directory. Don't forget to also install the test utilities via spaCy's
|
||||
``requirements.txt``:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
|
||||
pip install -r path/to/requirements.txt
|
||||
python -m pytest <spacy-directory>
|
||||
|
||||
See `the documentation <https://spacy.io/usage/#tests>`_ for more details and
|
||||
examples.
|
245
setup.py
|
@ -11,72 +11,68 @@ from distutils import ccompiler, msvccompiler
|
|||
from setuptools import Extension, setup, find_packages
|
||||
|
||||
|
||||
PACKAGE_DATA = {'': ['*.pyx', '*.pxd', '*.txt', '*.tokens']}
|
||||
PACKAGE_DATA = {"": ["*.pyx", "*.pxd", "*.txt", "*.tokens"]}
|
||||
|
||||
|
||||
PACKAGES = find_packages()
|
||||
|
||||
|
||||
MOD_NAMES = [
|
||||
'spacy._align',
|
||||
'spacy.parts_of_speech',
|
||||
'spacy.strings',
|
||||
'spacy.lexeme',
|
||||
'spacy.vocab',
|
||||
'spacy.attrs',
|
||||
'spacy.morphology',
|
||||
'spacy.pipeline',
|
||||
'spacy.syntax.stateclass',
|
||||
'spacy.syntax._state',
|
||||
'spacy.tokenizer',
|
||||
'spacy.syntax.nn_parser',
|
||||
'spacy.syntax._parser_model',
|
||||
'spacy.syntax._beam_utils',
|
||||
'spacy.syntax.nonproj',
|
||||
'spacy.syntax.transition_system',
|
||||
'spacy.syntax.arc_eager',
|
||||
'spacy.gold',
|
||||
'spacy.tokens.doc',
|
||||
'spacy.tokens.span',
|
||||
'spacy.tokens.token',
|
||||
'spacy.tokens._retokenize',
|
||||
'spacy.matcher',
|
||||
'spacy.syntax.ner',
|
||||
'spacy.symbols',
|
||||
'spacy.vectors',
|
||||
"spacy._align",
|
||||
"spacy.parts_of_speech",
|
||||
"spacy.strings",
|
||||
"spacy.lexeme",
|
||||
"spacy.vocab",
|
||||
"spacy.attrs",
|
||||
"spacy.morphology",
|
||||
"spacy.pipeline",
|
||||
"spacy.syntax.stateclass",
|
||||
"spacy.syntax._state",
|
||||
"spacy.tokenizer",
|
||||
"spacy.syntax.nn_parser",
|
||||
"spacy.syntax._parser_model",
|
||||
"spacy.syntax._beam_utils",
|
||||
"spacy.syntax.nonproj",
|
||||
"spacy.syntax.transition_system",
|
||||
"spacy.syntax.arc_eager",
|
||||
"spacy.gold",
|
||||
"spacy.tokens.doc",
|
||||
"spacy.tokens.span",
|
||||
"spacy.tokens.token",
|
||||
"spacy.tokens._retokenize",
|
||||
"spacy.matcher",
|
||||
"spacy.syntax.ner",
|
||||
"spacy.symbols",
|
||||
"spacy.vectors",
|
||||
]
|
||||
|
||||
|
||||
COMPILE_OPTIONS = {
|
||||
'msvc': ['/Ox', '/EHsc'],
|
||||
'mingw32' : ['-O2', '-Wno-strict-prototypes', '-Wno-unused-function'],
|
||||
'other' : ['-O2', '-Wno-strict-prototypes', '-Wno-unused-function']
|
||||
COMPILE_OPTIONS = {
|
||||
"msvc": ["/Ox", "/EHsc"],
|
||||
"mingw32": ["-O2", "-Wno-strict-prototypes", "-Wno-unused-function"],
|
||||
"other": ["-O2", "-Wno-strict-prototypes", "-Wno-unused-function"],
|
||||
}
|
||||
|
||||
|
||||
LINK_OPTIONS = {
|
||||
'msvc' : [],
|
||||
'mingw32': [],
|
||||
'other' : []
|
||||
}
|
||||
LINK_OPTIONS = {"msvc": [], "mingw32": [], "other": []}
|
||||
|
||||
|
||||
# I don't understand this very well yet. See Issue #267
|
||||
# Fingers crossed!
|
||||
USE_OPENMP_DEFAULT = '0' if sys.platform != 'darwin' else None
|
||||
if os.environ.get('USE_OPENMP', USE_OPENMP_DEFAULT) == '1':
|
||||
if sys.platform == 'darwin':
|
||||
COMPILE_OPTIONS['other'].append('-fopenmp')
|
||||
LINK_OPTIONS['other'].append('-fopenmp')
|
||||
PACKAGE_DATA['spacy.platform.darwin.lib'] = ['*.dylib']
|
||||
PACKAGES.append('spacy.platform.darwin.lib')
|
||||
USE_OPENMP_DEFAULT = "0" if sys.platform != "darwin" else None
|
||||
if os.environ.get("USE_OPENMP", USE_OPENMP_DEFAULT) == "1":
|
||||
if sys.platform == "darwin":
|
||||
COMPILE_OPTIONS["other"].append("-fopenmp")
|
||||
LINK_OPTIONS["other"].append("-fopenmp")
|
||||
PACKAGE_DATA["spacy.platform.darwin.lib"] = ["*.dylib"]
|
||||
PACKAGES.append("spacy.platform.darwin.lib")
|
||||
|
||||
elif sys.platform == 'win32':
|
||||
COMPILE_OPTIONS['msvc'].append('/openmp')
|
||||
elif sys.platform == "win32":
|
||||
COMPILE_OPTIONS["msvc"].append("/openmp")
|
||||
|
||||
else:
|
||||
COMPILE_OPTIONS['other'].append('-fopenmp')
|
||||
LINK_OPTIONS['other'].append('-fopenmp')
|
||||
COMPILE_OPTIONS["other"].append("-fopenmp")
|
||||
LINK_OPTIONS["other"].append("-fopenmp")
|
||||
|
||||
|
||||
# By subclassing build_ext we get access to the actual compiler that will be used, which is only known after finalize_options
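The net effect of the `USE_OPENMP` block in the hunk above is that OpenMP flags are added only when the environment variable `USE_OPENMP` is set to `1`; the platform-specific default (`"0"` or `None`) never equals `"1"`. A rough sketch that isolates this decision as pure functions, so it can be checked without touching the real environment, might look as follows. The function names `openmp_enabled` and `openmp_flags` are illustrative and do not exist in `setup.py`.

```python
def openmp_enabled(platform, environ):
    """Mirror the USE_OPENMP check from the hunk above for a given platform."""
    default = "0" if platform != "darwin" else None
    return environ.get("USE_OPENMP", default) == "1"


def openmp_flags(platform, environ):
    """Return (options_key, flag) to append, or None when OpenMP stays off."""
    if not openmp_enabled(platform, environ):
        return None
    if platform == "win32":
        # The hunk only adds /openmp to the MSVC compile options.
        return ("msvc", "/openmp")
    # darwin and all remaining platforms get the GCC/Clang-style flag.
    return ("other", "-fopenmp")


if __name__ == "__main__":
    print(openmp_flags("linux", {}))
    print(openmp_flags("linux", {"USE_OPENMP": "1"}))
```

Passing the platform and environment in as arguments, rather than reading `sys.platform` and `os.environ` directly, is a choice made for this sketch only; it keeps the logic easy to exercise for every platform from one machine.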
|
||||
|
@ -85,10 +81,12 @@ class build_ext_options:
|
|||
def build_options(self):
|
||||
for e in self.extensions:
|
||||
e.extra_compile_args += COMPILE_OPTIONS.get(
|
||||
self.compiler.compiler_type, COMPILE_OPTIONS['other'])
|
||||
self.compiler.compiler_type, COMPILE_OPTIONS["other"]
|
||||
)
|
||||
for e in self.extensions:
|
||||
e.extra_link_args += LINK_OPTIONS.get(
|
||||
self.compiler.compiler_type, LINK_OPTIONS['other'])
|
||||
self.compiler.compiler_type, LINK_OPTIONS["other"]
|
||||
)
|
||||
|
||||
|
||||
class build_ext_subclass(build_ext, build_ext_options):
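`build_options` in the hunk above picks flags by the compiler's `compiler_type` and falls back to the `"other"` entry for anything it does not list (for example `"unix"`). A minimal sketch of that lookup, assuming the same option table as in the diff, is shown below. The helper `options_for` is a name invented here, not something `setup.py` defines.

```python
COMPILE_OPTIONS = {
    "msvc": ["/Ox", "/EHsc"],
    "mingw32": ["-O2", "-Wno-strict-prototypes", "-Wno-unused-function"],
    "other": ["-O2", "-Wno-strict-prototypes", "-Wno-unused-function"],
}


def options_for(compiler_type, table):
    """Look up flags for a compiler type, falling back to the "other" entry."""
    # Return a copy so callers can extend the result without mutating the table.
    return list(table.get(compiler_type, table["other"]))


if __name__ == "__main__":
    print(options_for("msvc", COMPILE_OPTIONS))
    print(options_for("unix", COMPILE_OPTIONS))  # not listed, uses "other"
```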
|
||||
|
@@ -98,22 +96,23 @@ class build_ext_subclass(build_ext, build_ext_options):
 
 
 def generate_cython(root, source):
-    print('Cythonizing sources')
-    p = subprocess.call([sys.executable,
-                         os.path.join(root, 'bin', 'cythonize.py'),
-                         source], env=os.environ)
+    print("Cythonizing sources")
+    p = subprocess.call(
+        [sys.executable, os.path.join(root, "bin", "cythonize.py"), source],
+        env=os.environ,
+    )
     if p != 0:
-        raise RuntimeError('Running cythonize failed')
+        raise RuntimeError("Running cythonize failed")
 
 
 def is_source_release(path):
-    return os.path.exists(os.path.join(path, 'PKG-INFO'))
+    return os.path.exists(os.path.join(path, "PKG-INFO"))
 
 
 def clean(path):
     for name in MOD_NAMES:
-        name = name.replace('.', '/')
-        for ext in ['.so', '.html', '.cpp', '.c']:
+        name = name.replace(".", "/")
+        for ext in [".so", ".html", ".cpp", ".c"]:
             file_path = os.path.join(path, name + ext)
             if os.path.exists(file_path):
                 os.unlink(file_path)
@@ -134,100 +133,110 @@ def chdir(new_dir):
 def setup_package():
     root = os.path.abspath(os.path.dirname(__file__))
 
-    if len(sys.argv) > 1 and sys.argv[1] == 'clean':
+    if len(sys.argv) > 1 and sys.argv[1] == "clean":
         return clean(root)
 
     with chdir(root):
-        with io.open(os.path.join(root, 'spacy', 'about.py'), encoding='utf8') as f:
+        with io.open(os.path.join(root, "spacy", "about.py"), encoding="utf8") as f:
             about = {}
             exec(f.read(), about)
 
-        with io.open(os.path.join(root, 'README.rst'), encoding='utf8') as f:
+        with io.open(os.path.join(root, "README.md"), encoding="utf8") as f:
             readme = f.read()
 
         include_dirs = [
             get_python_inc(plat_specific=True),
-            os.path.join(root, 'include')]
+            os.path.join(root, "include"),
+        ]
 
-        if (ccompiler.new_compiler().compiler_type == 'msvc'
-            and msvccompiler.get_build_version() == 9):
-            include_dirs.append(os.path.join(root, 'include', 'msvc9'))
+        if (
+            ccompiler.new_compiler().compiler_type == "msvc"
+            and msvccompiler.get_build_version() == 9
+        ):
+            include_dirs.append(os.path.join(root, "include", "msvc9"))
 
         ext_modules = []
         for mod_name in MOD_NAMES:
-            mod_path = mod_name.replace('.', '/') + '.cpp'
+            mod_path = mod_name.replace(".", "/") + ".cpp"
             extra_link_args = []
             # ???
             # Imported from patch from @mikepb
             # See Issue #267. Running blind here...
-            if sys.platform == 'darwin':
-                dylib_path = ['..' for _ in range(mod_name.count('.'))]
-                dylib_path = '/'.join(dylib_path)
-                dylib_path = '@loader_path/%s/spacy/platform/darwin/lib' % dylib_path
-                extra_link_args.append('-Wl,-rpath,%s' % dylib_path)
+            if sys.platform == "darwin":
+                dylib_path = [".." for _ in range(mod_name.count("."))]
+                dylib_path = "/".join(dylib_path)
+                dylib_path = "@loader_path/%s/spacy/platform/darwin/lib" % dylib_path
+                extra_link_args.append("-Wl,-rpath,%s" % dylib_path)
             ext_modules.append(
-                Extension(mod_name, [mod_path],
-                    language='c++', include_dirs=include_dirs,
-                    extra_link_args=extra_link_args))
+                Extension(
+                    mod_name,
+                    [mod_path],
+                    language="c++",
+                    include_dirs=include_dirs,
+                    extra_link_args=extra_link_args,
+                )
+            )
 
         if not is_source_release(root):
-            generate_cython(root, 'spacy')
+            generate_cython(root, "spacy")
 
         setup(
-            name=about['__title__'],
+            name=about["__title__"],
             zip_safe=False,
             packages=PACKAGES,
             package_data=PACKAGE_DATA,
-            description=about['__summary__'],
+            description=about["__summary__"],
             long_description=readme,
-            author=about['__author__'],
-            author_email=about['__email__'],
-            version=about['__version__'],
-            url=about['__uri__'],
-            license=about['__license__'],
+            long_description_content_type="text/markdown",
+            author=about["__author__"],
+            author_email=about["__email__"],
+            version=about["__version__"],
+            url=about["__uri__"],
+            license=about["__license__"],
             ext_modules=ext_modules,
-            scripts=['bin/spacy'],
+            scripts=["bin/spacy"],
             install_requires=[
-                'numpy>=1.15.0',
-                'murmurhash>=0.28,<0.29',
-                'cymem>=1.30,<1.32',
-                'preshed>=1.0.0,<2.0.0',
-                'thinc>=6.11.2,<6.12.0',
-                'plac<1.0.0,>=0.9.6',
-                'ujson>=1.35',
-                'regex==2017.4.5',
-                'dill>=0.2,<0.3',
-                'requests>=2.13.0,<3.0.0',
-                'pathlib==1.0.1; python_version < "3.4"'],
-            setup_requires=['wheel'],
+                "numpy>=1.15.0",
+                "murmurhash>=0.28,<0.29",
+                "cymem>=1.30,<1.32",
+                "preshed>=1.0.0,<2.0.0",
+                "thinc>=6.11.2,<6.12.0",
+                "plac<1.0.0,>=0.9.6",
+                "ujson>=1.35",
+                "regex==2017.4.5",
+                "dill>=0.2,<0.3",
+                "requests>=2.13.0,<3.0.0",
+                'pathlib==1.0.1; python_version < "3.4"',
+            ],
+            setup_requires=["wheel"],
             extras_require={
-                'cuda': ['cupy>=4.0'],
-                'cuda80': ['cupy-cuda80>=4.0'],
-                'cuda90': ['cupy-cuda90>=4.0'],
-                'cuda91': ['cupy-cuda91>=4.0'],
+                "cuda": ["cupy>=4.0"],
+                "cuda80": ["cupy-cuda80>=4.0"],
+                "cuda90": ["cupy-cuda90>=4.0"],
+                "cuda91": ["cupy-cuda91>=4.0"],
             },
             classifiers=[
-                'Development Status :: 5 - Production/Stable',
-                'Environment :: Console',
-                'Intended Audience :: Developers',
-                'Intended Audience :: Science/Research',
-                'License :: OSI Approved :: MIT License',
-                'Operating System :: POSIX :: Linux',
-                'Operating System :: MacOS :: MacOS X',
-                'Operating System :: Microsoft :: Windows',
-                'Programming Language :: Cython',
-                'Programming Language :: Python :: 2',
-                'Programming Language :: Python :: 2.7',
-                'Programming Language :: Python :: 3',
-                'Programming Language :: Python :: 3.4',
-                'Programming Language :: Python :: 3.5',
-                'Programming Language :: Python :: 3.6',
-                'Programming Language :: Python :: 3.7',
-                'Topic :: Scientific/Engineering'],
-            cmdclass = {
-                'build_ext': build_ext_subclass},
+                "Development Status :: 5 - Production/Stable",
+                "Environment :: Console",
+                "Intended Audience :: Developers",
+                "Intended Audience :: Science/Research",
+                "License :: OSI Approved :: MIT License",
+                "Operating System :: POSIX :: Linux",
+                "Operating System :: MacOS :: MacOS X",
+                "Operating System :: Microsoft :: Windows",
+                "Programming Language :: Cython",
+                "Programming Language :: Python :: 2",
+                "Programming Language :: Python :: 2.7",
+                "Programming Language :: Python :: 3",
+                "Programming Language :: Python :: 3.4",
+                "Programming Language :: Python :: 3.5",
+                "Programming Language :: Python :: 3.6",
+                "Programming Language :: Python :: 3.7",
+                "Topic :: Scientific/Engineering",
+            ],
+            cmdclass={"build_ext": build_ext_subclass},
         )
 
 
-if __name__ == '__main__':
+if __name__ == "__main__":
     setup_package()
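The substantive change in this diff is that `setup()` now reads `README.md` and passes `long_description_content_type="text/markdown"`, so the package index renders the description as Markdown instead of treating it as reStructuredText. Below is a minimal sketch of that step in isolation. The helper name `long_description_kwargs` is hypothetical and does not appear in spaCy's `setup.py`.

```python
import io
import os


def long_description_kwargs(root, filename="README.md"):
    """Read the README under root and return the matching setup() keywords.

    Hypothetical helper for illustration; the diff inlines this in setup_package().
    """
    with io.open(os.path.join(root, filename), encoding="utf8") as f:
        readme = f.read()
    return {
        "long_description": readme,
        # Tells the package index to render the text as Markdown.
        "long_description_content_type": "text/markdown",
    }
```

A caller could then write `setup(name=..., **long_description_kwargs(root))`. Note that `MANIFEST.in` must also include `README.md`, as the first hunk of this commit does, or a source distribution will fail to find the file at install time.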