Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e30348b331 
							
						 
					 
					
						
						
							
							Prefer to import from symbols instead of parts_of_speech  
						
						 
						
						
						
					 
					
						2016-11-04 00:27:55 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f5fe4f595b 
							
						 
					 
					
						
						
							
							Fix json loading, for Python 3.  
						
						 
						
						
						
					 
					
						2016-10-20 21:23:26 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2e92c6fb3a 
							
						 
					 
					
						
						
							
							Fix JSON encoding issue on load  
						
						 
						
						
						
					 
					
						2016-10-20 21:06:48 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f189a3cb00 
							
						 
					 
					
						
						
							
							Fix encoding when opening files in Python 2.7, re Issue  #539  
						
						 
						
						
						
					 
					
						2016-10-20 14:42:56 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							a2f3510d6d 
							
						 
					 
					
						
						
							
							Fix lemmatizer  
						
						 
						
						
						
					 
					
						2016-09-27 17:47:05 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							35cd953f9e 
							
						 
					 
					
						
						
							
							Fix pos name conflict with morphology  
						
						 
						
						
						
					 
					
						2016-09-27 14:16:22 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							40509e8bca 
							
						 
					 
					
						
						
							
							Tweak the new is_base_form logic, because we can expect the 'pos' key in the morphology we're passed.  
						
						 
						
						
						
					 
					
						2016-09-27 14:01:16 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3cb4d455d2 
							
						 
					 
					
						
						
							
							Pass lemmatizer morphological features, so that rules are sensitive to base/inflected distinction, which is how the WordNet data is designed. See Issue  #435  
						
						 
						
						
						
					 
					
						2016-09-27 13:52:11 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fd65cf6cbb 
							
						 
					 
					
						
						
							
							Finish refactoring data loading  
						
						 
						
						
						
					 
					
						2016-09-24 20:26:17 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							83e364188c 
							
						 
					 
					
						
						
							
							Mostly finished loading refactoring. Design is in place, but doesn't work yet.  
						
						 
						
						
						
					 
					
						2016-09-24 15:42:01 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							846fa49b2a 
							
						 
					 
					
						
						
							
							distinct load() and from_package() methods  
						
						 
						
						
						
					 
					
						2016-01-16 10:00:57 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							788f734513 
							
						 
					 
					
						
						
							
							refactored data_dir->via, add zip_safe, add spacy.load()  
						
						 
						
						
						
					 
					
						2016-01-15 18:01:02 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							bc229790ac 
							
						 
					 
					
						
						
							
							integrate with sputnik  
						
						 
						
						
						
					 
					
						2016-01-13 19:46:17 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							eaf2ad59f1 
							
						 
					 
					
						
						
							
							* Fix use of mock Package object  
						
						 
						
						
						
					 
					
						2015-12-31 04:13:15 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							55bcdf8bdd 
							
						 
					 
					
						
						
							
							* Fix errors  
						
						 
						
						
						
					 
					
						2015-12-29 22:32:03 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							aec130af56 
							
						 
					 
					
						
						
							
							Use util.Package class for io  
						
						 
						
						
						
						
						The previous Sputnik integration caused an API change: Vocab, Tagger, etc.
were loaded via a from_package classmethod that required a
sputnik.Package instance. This forced users to first create a
sputnik.Sputnik() instance in order to acquire a Package via
sp.pool().
Instead, I've created a small file-system shim, util.Package, which
allows classes to have a .load() classmethod that accepts either
util.Package objects or strings. We can later gut the internals
of this and make it a proxy for Sputnik if we need more functionality
that should live in the Sputnik library.
Sputnik is now only used to download and install the data, in
spacy.en.download.
						
					 
					
						2015-12-29 18:00:48 +01:00  
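The pattern described in the commit message above (a thin file-system shim plus a .load() classmethod that accepts either a package object or a plain string) might look roughly like the sketch below. It is not the code from this commit: the `Package` methods (`file_path`, `has_file`) and the `Component` class are hypothetical names chosen for illustration, and the real util.Package may differ.

```python
import os
import tempfile


class Package(object):
    """Hypothetical file-system shim: wraps a data directory on disk."""

    def __init__(self, data_dir):
        self.data_dir = data_dir

    def file_path(self, *parts):
        # Build a path inside the wrapped directory.
        return os.path.join(self.data_dir, *parts)

    def has_file(self, *parts):
        # True only if the path exists and is a regular file.
        return os.path.isfile(self.file_path(*parts))


class Component(object):
    """Stand-in for a class such as Vocab or Tagger (illustrative only)."""

    def __init__(self, package):
        self.package = package

    @classmethod
    def load(cls, pkg_or_path):
        # Accept either a Package object or a plain directory string,
        # so callers never need to construct a Package themselves.
        if isinstance(pkg_or_path, Package):
            package = pkg_or_path
        else:
            package = Package(pkg_or_path)
        return cls(package)


# Usage: both call styles end up with the same kind of object.
data_dir = tempfile.mkdtemp()
from_string = Component.load(data_dir)
from_package = Component.load(Package(data_dir))
```

Normalising the argument inside load() keeps the string-versus-object decision in one place, which is what would let the shim be swapped for a Sputnik-backed proxy later without changing callers.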
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c5902f2b4b 
							
						 
					 
					
						
						
							
							* Upd Lemmatizer to use MockPackage. Replace from_package with load() classmethod  
						
						 
						
						
						
					 
					
						2015-12-29 16:56:02 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							8359bd4d93 
							
						 
					 
					
						
						
							
							strip data/ from package, friendlier Language invocation, make data_dir backward/forward-compatible  
						
						 
						
						
						
					 
					
						2015-12-18 09:52:55 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Henning Peters 
							
						 
					 
					
						
						
						
						
							
						
						
							9027cef3bc 
							
						 
					 
					
						
						
							
							access model via sputnik  
						
						 
						
						
						
					 
					
						2015-12-07 06:01:28 +01:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								maxirmx 
							
						 
					 
					
						
						
						
						
							
						
						
							f07e4accd7 
							
						 
					 
					
						
						
							
							Fixing encoding issue  #4  
						
						 
						
						
						
					 
					
						2015-10-21 20:45:56 +03:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								maxirmx 
							
						 
					 
					
						
						
						
						
							
						
						
							fcbfff043f 
							
						 
					 
					
						
						
							
							Fixing encoding issue  #3  
						
						 
						
						
						
					 
					
						2015-10-21 15:52:34 +03:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								maxirmx 
							
						 
					 
					
						
						
						
						
							
						
						
							fe9d2e2c4e 
							
						 
					 
					
						
						
							
							Fixing encode issue  #2  
						
						 
						
						
						
					 
					
						2015-10-21 15:36:21 +03:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								maxirmx 
							
						 
					 
					
						
						
						
						
							
						
						
							e4a1726f77 
							
						 
					 
					
						
						
							
							Fixing encoding issue  
						
						 
						
						
						
						
						UTF-8 
						
					 
					
						2015-10-21 14:16:37 +03:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5332c0b697 
							
						 
					 
					
						
						
							
							* Add support for punctuation lemmatization, to handle unicode characters. This should help in addressing Issue  #130  
						
						 
						
						
						
					 
					
						2015-10-09 18:54:40 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							24ed3fc25c 
							
						 
					 
					
						
						
							
							* Check file existence before opening in lemmatizer  
						
						 
						
						
						
					 
					
						2015-09-13 10:45:21 +10:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							631c843ed1 
							
						 
					 
					
						
						
							
							* Don't look for index.adv in lemmatizer  
						
						 
						
						
						
					 
					
						2015-09-12 06:03:44 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7c660c5efc 
							
						 
					 
					
						
						
							
							* Use dict.get in lemmatizer  
						
						 
						
						
						
					 
					
						2015-09-10 14:51:39 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							64d71f8893 
							
						 
					 
					
						
						
							
							* Fix lemmatizer  
						
						 
						
						
						
					 
					
						2015-09-08 15:38:03 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f0a7c99554 
							
						 
					 
					
						
						
							
							* Relax rule-requirement in lemmatizer  
						
						 
						
						
						
					 
					
						2015-08-27 10:26:19 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0af139e183 
							
						 
					 
					
						
						
							
							* Tagger training now working. Still need to test load/save of model. Morphology still broken.  
						
						 
						
						
						
					 
					
						2015-08-27 09:16:11 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c5a27d1821 
							
						 
					 
					
						
						
							
							* Move lemmatizer to spacy  
						
						 
						
						
						
					 
					
						2015-08-25 15:47:08 +02:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e1c1a4b868 
							
						 
					 
					
						
						
							
							* Tmp  
						
						 
						
						
						
					 
					
						2014-12-21 05:36:29 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							99bbbb6feb 
							
						 
					 
					
						
						
							
							* Work on morphological processing  
						
						 
						
						
						
					 
					
						2014-12-08 21:12:15 +11:00  
					
					
						 
						
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7b68f911cf 
							
						 
					 
					
						
						
							
							* Add WordNet lemmatizer  
						
						 
						
						
						
					 
					
						2014-12-08 01:39:13 +11:00