Matthew Honnibal | 21e90d7d0b | Changes to test for new string-store | 2016-09-30 20:00:58 +02:00 |
Matthew Honnibal | 99de44d864 | Changes to Doc and Token for new string store scheme | 2016-09-30 20:00:21 +02:00 |
Matthew Honnibal | 78f19baafa | Fix report of ParserStateError | 2016-09-30 19:59:22 +02:00 |
Matthew Honnibal | 0442e0ab1e | Changes to transition systems for new StringStore scheme | 2016-09-30 19:58:51 +02:00 |
Matthew Honnibal | 22d4752d64 | Changes to strings.pyx for new StringStore scheme | 2016-09-30 19:58:09 +02:00 |
Matthew Honnibal | 4f794b215a | Changes to iterators.pyx for new StringStore scheme | 2016-09-30 19:57:49 +02:00 |
Matthew Honnibal | 95f8cfd745 | Changes to morphology.pyx for new StringStore scheme | 2016-09-30 19:57:10 +02:00 |
Matthew Honnibal | 3ff09614e0 | Changes to matcher.pyx for new StringStore scheme | 2016-09-30 19:56:48 +02:00 |
Matthew Honnibal | eceeaefe53 | Fix defaults for Parser and Entity, adding a blank= argument. | 2016-09-30 19:56:06 +02:00 |
Matthew Honnibal | 8423e8627f | Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good. | 2016-09-30 10:14:47 +02:00 |
Matthew Honnibal | d3dc5718b2 | Fix syntax error in Doc | 2016-09-28 11:39:49 +02:00 |
Matthew Honnibal | 1b520e7bab | Improve docstrings for Doc object | 2016-09-28 11:15:13 +02:00 |
Matthew Honnibal | 81a47c01d8 | Fix test for empty sentence string. | 2016-09-27 19:21:22 +02:00 |
Matthew Honnibal | 4cbf0d3bb6 | Handle errors when no valid actions are available, pointing users to the issue tracker. | 2016-09-27 19:19:53 +02:00 |
Matthew Honnibal | 430473bd98 | Raise errors when no actions are available, re Issue #429 | 2016-09-27 19:09:37 +02:00 |
Matthew Honnibal | fc4a7ad794 | Test and fix Issue #411: IndexError when .sents property is used on empty string. | 2016-09-27 18:49:14 +02:00 |
Matthew Honnibal | 3d370b7d45 | Add test for Issue #445, fixed in 3cb4d455d, with improved lemmatizer logic | 2016-09-27 18:39:46 +02:00 |
Matthew Honnibal | a2f3510d6d | Fix lemmatizer | 2016-09-27 17:47:05 +02:00 |
Matthew Honnibal | 07776d8096 | Fix pos name conflict in lemmatize | 2016-09-27 17:35:58 +02:00 |
Matthew Honnibal | 35cd953f9e | Fix pos name conflict with morphology | 2016-09-27 14:16:22 +02:00 |
Matthew Honnibal | 8e7df3c4ca | Expect the parser data, if parser.load() is called. | 2016-09-27 14:02:12 +02:00 |
Matthew Honnibal | bb4f201ad2 | Pass morphological features from tag map into the lemmatizer. | 2016-09-27 14:01:43 +02:00 |
Matthew Honnibal | 40509e8bca | Tweak the new is_base_form logic, because we can expect the 'pos' key in the morphology we're passed. | 2016-09-27 14:01:16 +02:00 |
Matthew Honnibal | 9c8ac91d72 | Add test for Issue #435 | 2016-09-27 13:52:38 +02:00 |
Matthew Honnibal | 3cb4d455d2 | Pass lemmatizer morphological features, so that rules are sensitive to base/inflected distinction, which is how the WordNet data is designed. See Issue #435 | 2016-09-27 13:52:11 +02:00 |
Matthew Honnibal | e233328d38 | Fix Issue #371: Lexeme objects were unhashable. | 2016-09-27 13:22:30 +02:00 |
Matthew Honnibal | e382e48d9f | Temporarily patch handling of default templates for tagger. Need to move these to language_data. | 2016-09-27 13:21:28 +02:00 |
Matthew Honnibal | a44763af0e | Fix Issue #469: Incorrectly cased root label in noun chunk iterator | 2016-09-27 13:13:01 +02:00 |
Matthew Honnibal | b14b9b096b | Return None if /deps directory not present, instead of trying to load the parser. | 2016-09-26 18:48:03 +02:00 |
Matthew Honnibal | e07b9665f7 | Don't expect parser model | 2016-09-26 18:09:33 +02:00 |
Matthew Honnibal | ee6fa106da | Fix parser features | 2016-09-26 17:57:32 +02:00 |
Matthew Honnibal | e607e4b598 | Fix parser loading | 2016-09-26 17:51:11 +02:00 |
Matthew Honnibal | 0b2d7ae9d6 | Fix Entity creation | 2016-09-26 15:41:22 +02:00 |
Matthew Honnibal | 2debc4e0a2 | Add .blank() method to Parser. Start housing default dep labels and entity types within the Defaults class. | 2016-09-26 11:57:54 +02:00 |
Matthew Honnibal | 722199acb8 | Add spacy.blank() method, that doesn't load data. Don't try to load data if path is falsey | 2016-09-26 11:07:46 +02:00 |
Matthew Honnibal | e56653f848 | Add language data for German | 2016-09-25 15:44:45 +02:00 |
Matthew Honnibal | 7db956133e | Move tokenizer data for German into spacy.de.language_data | 2016-09-25 15:37:33 +02:00 |
Matthew Honnibal | 95aaea0d3f | Refactor so that the tokenizer data is read from Python data, rather than from disk | 2016-09-25 14:49:53 +02:00 |
Matthew Honnibal | d7e9acdcdf | Add English language data, so that the tokenizer doesn't require the data download | 2016-09-25 14:49:00 +02:00 |
Matthew Honnibal | 82b8cc5efb | Whitespace | 2016-09-24 22:17:01 +02:00 |
Matthew Honnibal | fd58f7655a | Python 3 compatible basestring | 2016-09-24 22:16:43 +02:00 |
Matthew Honnibal | 082e95b19e | Python 3 compatible basestring | 2016-09-24 22:09:21 +02:00 |
Matthew Honnibal | f19af6cb2c | Python 3 compatible basestring | 2016-09-24 22:08:43 +02:00 |
Matthew Honnibal | 3ed4cdfe32 | Handle pathlib.Path objects in CFile | 2016-09-24 22:01:46 +02:00 |
Matthew Honnibal | df88690177 | Fix encoding of path variable | 2016-09-24 21:13:15 +02:00 |
Matthew Honnibal | af847e07fc | Fix usage of pathlib for Python3 -- turning paths to strings. | 2016-09-24 21:05:27 +02:00 |
Matthew Honnibal | 453683aaf0 | Fix spacy/vocab.pyx | 2016-09-24 20:50:31 +02:00 |
Matthew Honnibal | fd65cf6cbb | Finish refactoring data loading | 2016-09-24 20:26:17 +02:00 |
Matthew Honnibal | 83e364188c | Mostly finished loading refactoring. Design is in place, but doesn't work yet. | 2016-09-24 15:42:01 +02:00 |
Matthew Honnibal | 9dc8043a7e | Refactor Language to use new Defaults class, and work on revised data loading. We're getting rid of sputnik's weird file-system wrapper, and using pathlib. | 2016-09-24 14:08:53 +02:00 |