mirror of
				https://github.com/explosion/spaCy.git
				synced 2025-10-30 23:47:31 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			48 lines
		
	
	
		
			1.9 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			48 lines
		
	
	
		
			1.9 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
| .. spaCy documentation master file, created by
 | |
|    sphinx-quickstart on Tue Aug 19 16:27:38 2014.
 | |
|    You can adapt this file completely to your liking, but it should at least
 | |
|    contain the root `toctree` directive.
 | |
| 
 | |
| spaCy NLP Tokenizer and Lexicon
 | |
| ================================
 | |
| 
 | |
| spaCy is a library for industrial strength NLP in Python and Cython.  Its core
 | |
| values are efficiency, accuracy and minimalism.  
 | |
| 
 | |
| * Efficiency: spaCy is TODOx faster than the Stanford tools, and TODOx faster
 | |
|   than NLTK.  You won't find faster NLP tools. Using spaCy will save you
 | |
|   thousands in server costs, and will force you to make fewer compromises.
 | |
| 
 | |
| * Accuracy:  All spaCy tools are within 0.5% of the current published
 | |
|   state-of-the-art, on both news and web text. NLP moves fast, so always check
 | |
|   the numbers --- and don't settle for tools that aren't backed by
 | |
|   rigorous recent evaluation. An algorithm that was "close enough to state-of-the-art"
 | |
|   5 years ago is probably crap by today's standards.
 | |
| 
 | |
| * Minimalism:  This isn't a library that covers 43 known algorithms to do X. You
 | |
|   get 1 --- the best one --- with a simple, low-level interface. This keeps the
 | |
|   code-base small and concrete.  Our Python APIs use lists and
 | |
|   dictionaries, and our C/Cython APIs use arrays and simple structs.
 | |
|   
 | |
| 
 | |
| Comparison
 | |
| ----------
 | |
| +-------------+-------------+---+-----------+--------------+
 | |
| | POS taggers | Speed (w/s) | % Acc. (news) | % Acc. (web) |
 | |
| +-------------+-------------+---------------+--------------+
 | |
| | spaCy       |             |               |              |
 | |
| +-------------+-------------+---------------+--------------+
 | |
| | Stanford    | 16,000      |               |              |
 | |
| +-------------+-------------+---------------+--------------+
 | |
| | NLTK        |             |               |              |
 | |
| +-------------+-------------+---------------+--------------+
 | |
| 
 | |
| 
 | |
| .. toctree::
 | |
|     :hidden:
 | |
|     :maxdepth: 3
 | |
|     
 | |
|     what/index.rst
 | |
|     why/index.rst
 | |
|     how/index.rst
 |