💫 Industrial-strength Natural Language Processing (NLP) in Python
Henning Peters f3e73c4ca4 spaCy now passes all tests for visual studio
Visual Studio Express Community 2015 running on Windows Server 2012 Standard:

https://ci.spacy.io/builders/win64-0/builds/38/steps/shell_1/logs/stdio
2015-12-18 13:09:39 +01:00


spaCy: Industrial-strength NLP

spaCy is a library for advanced natural language processing in Python and Cython.

Documentation and details: http://spacy.io/

spaCy is built on the very latest research, but it isn't researchware. It was designed from day 1 to be used in real products. It's commercial open-source software, released under the MIT license.

Features

  • Labelled dependency parsing (91.8% accuracy on OntoNotes 5)
  • Named entity recognition (82.6% accuracy on OntoNotes 5)
  • Part-of-speech tagging (97.1% accuracy on OntoNotes 5)
  • Easy to use word vectors
  • All strings mapped to integer IDs
  • Export to numpy data arrays
  • Alignment maintained to original string, ensuring easy markup calculation
  • Range of easy-to-use orthographic features
  • No pre-processing required. spaCy takes raw text as input, warts and newlines and all.
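
Three of the features above (strings mapped to integer IDs, alignment to the original string, and raw text as input) can be illustrated with a small standalone sketch. This is a toy written for this README, not spaCy's API or implementation: `ToyStringStore`, `toy_tokenize`, and `mark_up` are hypothetical names, and the whitespace-only tokenization is an assumption made to keep the example short.

```python
import re


class ToyStringStore:
    """Hypothetical stand-in for a string-to-ID table; not spaCy's own class."""

    def __init__(self):
        self._ids = {}
        self._strings = []

    def intern(self, string):
        # Return the existing ID, or assign the next free one.
        if string not in self._ids:
            self._ids[string] = len(self._strings)
            self._strings.append(string)
        return self._ids[string]

    def lookup(self, string_id):
        return self._strings[string_id]


def toy_tokenize(text, store):
    """Split raw text into (string_id, start_offset, end_offset) triples."""
    # \S+ matches runs of non-whitespace. Whitespace (including newlines) is
    # skipped, but stays recoverable through the character offsets.
    return [(store.intern(m.group()), m.start(), m.end())
            for m in re.finditer(r"\S+", text)]


def mark_up(text, tokens, target_id, open_tag="<b>", close_tag="</b>"):
    """Wrap every occurrence of one token ID in tags; other text is untouched."""
    pieces = []
    cursor = 0
    for string_id, start, end in tokens:
        if string_id == target_id:
            pieces.append(text[cursor:start])
            pieces.append(open_tag + text[start:end] + close_tag)
            cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


store = ToyStringStore()
raw = "the cat saw\nthe  dog"  # raw input: newline and double space left as-is
tokens = toy_tokenize(raw, store)
ids = [string_id for string_id, _, _ in tokens]
highlighted = mark_up(raw, tokens, store.intern("the"))
```

Because every token keeps its character offsets, the markup step never has to guess where a token sat in the input, and the original whitespace survives unchanged. Both occurrences of "the" share one integer ID, so comparisons between tokens are integer comparisons.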

Top Performance

  • Fastest in the world: <50ms per document. No faster system has ever been announced.
  • Accuracy within 1% of the current state of the art on all tasks performed (parsing, named entity recognition, part-of-speech tagging). The only more accurate systems are an order of magnitude slower or more.

Supports

  • CPython 2.7
  • CPython 3.4
  • CPython 3.5
  • OS X
  • Linux
  • Cygwin
  • Visual Studio

Difficult to support:

  • PyPy 2.7
  • PyPy 3.4