Why
===

Benchmarks
----------

Efficiency
----------

+--------+-------+--------------+--------------+
| System | Time  | Words/second | Speed Factor |
+========+=======+==============+==============+
| NLTK   | 6m4s  | 89,000       | 1.00         |
+--------+-------+--------------+--------------+
| spaCy  | 9.5s  | 3,093,000    | 38.30        |
+--------+-------+--------------+--------------+

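As a rough check on the table above, the Speed Factor column appears to be the
ratio of elapsed times, with NLTK as the baseline. The sketch below assumes that
reading. The helper names ``parse_duration`` and ``speed_factor`` are
hypothetical and are not part of spaCy or NLTK.

```python
import re


def parse_duration(text):
    """Convert a duration such as '6m4s' or '9.5s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)  # the minutes part is optional
    seconds = float(match.group(2))
    return minutes * 60 + seconds


def speed_factor(baseline, candidate):
    """How many times faster the candidate ran than the baseline."""
    return parse_duration(baseline) / parse_duration(candidate)


# Times copied from the table: NLTK is the baseline, spaCy the candidate.
factor = speed_factor("6m4s", "9.5s")
print(f"{factor:.1f}")  # 38.3
```

Under this assumption, 364 seconds divided by 9.5 seconds gives roughly 38.3,
which matches the figure reported for spaCy.
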
Accuracy
--------

The comparison covers 30 million words from the English Gigaword corpus, run
on a MacBook Air.  For context, calling string.split() on the same data
completes in about 5s.

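To set a tokenizer's throughput against the whitespace-splitting baseline
mentioned above, one could time ``str.split()`` on a text and report words per
second. This is a minimal sketch that uses a small synthetic string rather than
the Gigaword data. ``words_per_second`` is a hypothetical helper, and the
absolute numbers will vary by machine.

```python
import time


def words_per_second(tokenize, text):
    """Time one call to tokenize(text); return (word count, words/second)."""
    start = time.perf_counter()
    tokens = tokenize(text)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    rate = len(tokens) / elapsed if elapsed > 0 else float("inf")
    return len(tokens), rate


# Synthetic stand-in for the corpus (an assumption; not the Gigaword data).
sample = "the quick brown fox jumps over the lazy dog " * 100_000

n_words, rate = words_per_second(str.split, sample)
print(f"{n_words} words, {rate:,.0f} words/second")
```

Any callable that takes a string and returns a sequence of tokens can be passed
in place of ``str.split``, so the same helper could time a real tokenizer on the
same text for a like-for-like comparison.
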
Pros and Cons
-------------