Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							62ecdea9f2 
							
						 
					 
					
						
						
							
							Add binder class for document serialization  
						
						
						
					 
					
						2017-05-09 17:21:00 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6782eedf9b 
							
						 
					 
					
						
						
							
							Tmp GPU code  
						
						
						
					 
					
						2017-05-07 11:04:24 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4d98511db7 
							
						 
					 
					
						
						
							
							Make Span hashable.  Closes   #1019  
						
						
						
					 
					
						2017-04-26 19:01:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6a4221a6de 
							
						 
					 
					
						
						
							
							Allow lemma to be set from Python. Re  #973  
						
						
						
					 
					
						2017-04-16 18:07:53 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							0739ae7b76 
							
						 
					 
					
						
						
							
							Tidy up and fix formatting and imports  
						
						
						
					 
					
						2017-04-15 13:05:15 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							3b667a24d4 
							
						 
					 
					
						
						
							
							Remove whitespace  
						
						
						
					 
					
						2017-04-01 10:21:08 +02:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							e71a1f4bd0 
							
						 
					 
					
						
						
							
							Fix download commands in error messages (see  #946 )  
						
						
						
					 
					
						2017-04-01 10:20:57 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							51882ee2b8 
							
						 
					 
					
						
						
							
							Fix check for setting ent_id in merge  
						
						
						
					 
					
						2017-03-31 19:32:01 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fc3900e5b2 
							
						 
					 
					
						
						
							
							Allow ent_id to be set in Token  
						
						
						
					 
					
						2017-03-31 14:00:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9720103428 
							
						 
					 
					
						
						
							
							Improve attribute handling in doc.merge(). Still unsatisfying  
						
						
						
					 
					
						2017-03-31 13:59:58 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							0fefdfcbda 
							
						 
					 
					
						
						
							
							Merge pull request  #935  from ericzhao28/master  
						
						
						
						
						Add option to use label=ent_type in doc.merge arguments (Bug fix for issue #862 ) 
						
					 
					
						2017-03-30 02:51:24 +02:00 
						 
				 
			
				
					
						
							
							
								Eric Zhao 
							
						 
					 
					
						
						
						
						
							
						
						
							aafdf6ffb8 
							
						 
					 
					
						
						
							
							Add option to use label kwarg to determine ent_type in doc.merge  
						
						
						
					 
					
						2017-03-28 23:35:03 -07:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							28bb546939 
							
						 
					 
					
						
						
							
							Merge pull request  #883  from ericzhao28/master  
						
						
						
						
						Add `lower_` and `upper_` properties to `Span` class 
						
					 
					
						2017-03-16 23:35:47 +01:00 
						 
				 
			
				
					
						
							
							
								ines 
							
						 
					 
					
						
						
						
						
							
						
						
							66c1f194f9 
							
						 
					 
					
						
						
							
							Use consistent unicode declarations  
						
						
						
					 
					
						2017-03-12 13:07:28 +01:00 
						 
				 
			
				
					
						
							
							
								Em 
							
						 
					 
					
						
						
						
						
							
						
						
							9c809efc25 
							
						 
					 
					
						
						
							
							Removed mapStr  
						
						
						
					 
					
						2017-03-11 16:23:26 -08:00 
						 
				 
			
				
					
						
							
							
								Em 
							
						 
					 
					
						
						
						
						
							
						
						
							426d17167f 
							
						 
					 
					
						
						
							
							Added string manipulation for spans  
						
						
						
					 
					
						2017-03-10 16:50:02 -08:00 
						 
				 
			
				
					
						
							
							
								Roman Inflianskas 
							
						 
					 
					
						
						
						
						
							
						
						
							66e1109b53 
							
						 
					 
					
						
						
							
							Add support for Universal Dependencies v2.0  
						
						
						
					 
					
						2017-03-03 13:17:34 +01:00 
						 
				 
			
				
					
						
							
							
								Matvey Ezhov 
							
						 
					 
					
						
						
						
						
							
						
						
							32a22291bc 
							
						 
					 
					
						
						
							
							Small Doc.count_by documentation update  
						
						
						
						
						Current example doesn't work 
						
					 
					
						2017-01-31 19:18:45 +03:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6c665b81df 
							
						 
					 
					
						
						
							
							Fix redundant == TAG in from_array conditional  
						
						
						
					 
					
						2017-01-31 00:46:21 +11:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e7f8e13cf3 
							
						 
					 
					
						
						
							
							Make Token hashable.  Fixes   #743  
						
						
						
					 
					
						2017-01-16 13:27:57 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							12cd27b821 
							
						 
					 
					
						
						
							
							Amend 8ae8b443f: Handle comparison with None tokens.  
						
						
						
					 
					
						2017-01-11 13:03:32 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							44e2b0100d 
							
						 
					 
					
						
						
							
							Support TAG attribute in doc.from_array  
						
						
						
					 
					
						2017-01-10 22:47:07 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8ae8b443f1 
							
						 
					 
					
						
						
							
							Add richcmp method to Token.  Closes   #631  
						
						
						
					 
					
						2017-01-09 19:30:31 +01:00 
						 
				 
			
				
					
						
							
							
								kengz 
							
						 
					 
					
						
						
						
						
							
						
						
							73a38bd4d1 
							
						 
					 
					
						
						
							
							Merge remote-tracking branch 'upstream/master'  
						
						
						
					 
					
						2016-12-30 12:19:59 -05:00 
						 
				 
			
				
					
						
							
							
								kengz 
							
						 
					 
					
						
						
						
						
							
						
						
							da44183ae1 
							
						 
					 
					
						
						
							
							move parse_tree logic to a new tokens/printers.py file  
						
						
						
					 
					
						2016-12-30 12:19:18 -05:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							404019ad2f 
							
						 
					 
					
						
						
							
							Fix issue  #672 : ent_iob_ was a string, not unicode, due to missing unicode_literals statement.  
						
						
						
					 
					
						2016-12-18 22:33:53 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f6e356aada 
							
						 
					 
					
						
						
							
							Add (and test) Span.sentiment attribute. By default we average token.span, but can override with custom hook. Re Issue  #667  
						
						
						
					 
					
						2016-12-02 11:05:50 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							87613edf8f 
							
						 
					 
					
						
						
							
							Add set_struct_attr staticmethod to token  
						
						
						
					 
					
						2016-11-25 12:41:47 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fb69aa648f 
							
						 
					 
					
						
						
							
							Merge branch 'master' of ssh://github.com/explosion/spaCy  
						
						
						
					 
					
						2016-11-25 11:35:44 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9a03a3f85e 
							
						 
					 
					
						
						
							
							Add get_struct_attr staticmethod to Token, to match Lexeme.get_struct_attr.  
						
						
						
					 
					
						2016-11-25 11:35:17 +01:00 
						 
				 
			
				
					
						
							
							
								Pokey Rule 
							
						 
					 
					
						
						
						
						
							
						
						
							3e3bda142d 
							
						 
					 
					
						
						
							
							Add noun_chunks to Span  
						
						
						
					 
					
						2016-11-24 10:47:20 +00:00 
						 
				 
			
				
					
						
							
							
								tiago 
							
						 
					 
					
						
						
						
						
							
						
						
							b38cfd0ef9 
							
						 
					 
					
						
						
							
							now span.merge returns token like it says on documentation  
						
						
						
					 
					
						2016-11-09 14:58:19 +00:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1fb09c3dc1 
							
						 
					 
					
						
						
							
							Fix morphology tagger  
						
						
						
					 
					
						2016-11-04 19:19:09 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							293c79c09a 
							
						 
					 
					
						
						
							
							Fix   #595 : Lemmatization was incorrect for base forms, because morphological analyser wasn't adding morphology properly.  
						
						
						
					 
					
						2016-11-04 00:29:07 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f292f7f0e6 
							
						 
					 
					
						
						
							
							Fix Issue  #599 , by considering empty documents to be parsed and tagged. Implementation is a bit dodgy.  
						
						
						
					 
					
						2016-11-02 23:48:43 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							05a8b752a2 
							
						 
					 
					
						
						
							
							Fix Issue  #600 : Missing setters for Token attribute.  
						
						
						
					 
					
						2016-11-02 23:28:59 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							11664b9f20 
							
						 
					 
					
						
						
							
							Fix variable error in token  
						
						
						
					 
					
						2016-11-01 13:28:00 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8c4d1b46ce 
							
						 
					 
					
						
						
							
							Fix variable error in Span  
						
						
						
					 
					
						2016-11-01 13:27:44 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e7af6b937f 
							
						 
					 
					
						
						
							
							Fix syntax error while fixing doc strings  
						
						
						
					 
					
						2016-11-01 13:27:32 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b86f8af0c1 
							
						 
					 
					
						
						
							
							Fix doc strings  
						
						
						
					 
					
						2016-11-01 12:25:36 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4ca31b4d87 
							
						 
					 
					
						
						
							
							Fix clobbering of 'missing' named ent values after assigning ents.  
						
						
						
					 
					
						2016-10-26 13:13:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							15c9b59f0e 
							
						 
					 
					
						
						
							
							Fix Issue  #461 : O tag was being clobbered by doc.ents.__set__  
						
						
						
					 
					
						2016-10-23 15:50:26 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2c3a67b693 
							
						 
					 
					
						
						
							
							Fix calculation of vector norm, re Issue  #522 . Need to consolidate the calculations into a helper function.  
						
						
						
					 
					
						2016-10-23 14:49:31 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e80944276f 
							
						 
					 
					
						
						
							
							Fix Span.vector_norm  
						
						
						
					 
					
						2016-10-20 21:58:56 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							3588a18fb8 
							
						 
					 
					
						
						
							
							Fix hook names in doc  
						
						
						
					 
					
						2016-10-19 21:15:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5d5742b773 
							
						 
					 
					
						
						
							
							Add sentiment field to doc, rename getters_for_tokens and getters_for_spans, add user_hooks field to Doc.  
						
						
						
					 
					
						2016-10-19 20:54:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							9b60186266 
							
						 
					 
					
						
						
							
							Fix doc class  
						
						
						
					 
					
						2016-10-17 15:23:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							7fd98fc91c 
							
						 
					 
					
						
						
							
							Remove deprecation shim around str/bytes in Token.  
						
						
						
					 
					
						2016-10-17 14:02:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b67697a97b 
							
						 
					 
					
						
						
							
							Improve API for doc.merge() and span.merge(), to use keyword arguments.  
						
						
						
					 
					
						2016-10-17 14:02:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fbb7f3f15c 
							
						 
					 
					
						
						
							
							Add user_data attribute to Doc object.  
						
						
						
					 
					
						2016-10-17 11:43:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c1abc8f6ed 
							
						 
					 
					
						
						
							
							Fix deprecation stuff in Token: Remove the shim for the str/unicode semantics, and raise for has_repvec and repvec  
						
						
						
					 
					
						2016-10-17 11:18:41 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							09ab447a18 
							
						 
					 
					
						
						
							
							Remove tensor property from token.  
						
						
						
					 
					
						2016-10-17 02:45:09 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5d10e2005c 
							
						 
					 
					
						
						
							
							Defer some attributes to Doc, via getters_for_tokens attribute.  
						
						
						
					 
					
						2016-10-17 02:44:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8829984efb 
							
						 
					 
					
						
						
							
							Remove tensor attribute from Span and Token.  
						
						
						
					 
					
						2016-10-17 02:44:04 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d15a88c66a 
							
						 
					 
					
						
						
							
							Defer some attributes to Doc via getters_for_spans  
						
						
						
					 
					
						2016-10-17 02:43:35 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							62230dd13a 
							
						 
					 
					
						
						
							
							Add getters_for_spans and getters_for_tokens attributes to Doc. Fix docstring  
						
						
						
					 
					
						2016-10-17 02:42:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ae11ea8240 
							
						 
					 
					
						
						
							
							Add getters_for_tokens and getters_for_spans attributes to Doc object.  
						
						
						
					 
					
						2016-10-17 02:42:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							311a985fe0 
							
						 
					 
					
						
						
							
							Add input error handling in Doc  
						
						
						
					 
					
						2016-10-16 18:16:42 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							06322ba99d 
							
						 
					 
					
						
						
							
							Add words and spaces keyword arguments to Doc.  
						
						
						
					 
					
						2016-10-16 18:13:03 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							f3be9d0a9a 
							
						 
					 
					
						
						
							
							Add tensor field to Lexeme, Token, Doc and Span, so that users have a place to hang neural network outputs  
						
						
						
					 
					
						2016-10-14 03:24:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ca32a1ab01 
							
						 
					 
					
						
						
							
							Revert "Work on Issue  #285 : intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good."  
						
						
						
						
						This reverts commit 8423e8627f 
						
					 
					
						2016-09-30 20:20:22 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6736977d82 
							
						 
					 
					
						
						
							
							Revert "Changes to Doc and Token for new string store scheme"  
						
						
						
						
						This reverts commit 99de44d864 
						
					 
					
						2016-09-30 20:11:15 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							99de44d864 
							
						 
					 
					
						
						
							
							Changes to Doc and Token for new string store scheme  
						
						
						
					 
					
						2016-09-30 20:00:21 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8423e8627f 
							
						 
					 
					
						
						
							
							Work on Issue  #285 : intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good.  
						
						
						
					 
					
						2016-09-30 10:14:47 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							d3dc5718b2 
							
						 
					 
					
						
						
							
							Fix syntax error in Doc  
						
						
						
					 
					
						2016-09-28 11:39:49 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1b520e7bab 
							
						 
					 
					
						
						
							
							Improve docstrings for Doc object  
						
						
						
					 
					
						2016-09-28 11:15:13 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							fc4a7ad794 
							
						 
					 
					
						
						
							
							Test and fix Issue  #411 : IndexError when .sents property is used on empty string.  
						
						
						
					 
					
						2016-09-27 18:49:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							15e42a1ba9 
							
						 
					 
					
						
						
							
							Allow entities to be set by Span, or by 4-tuple (with entity ID)  
						
						
						
					 
					
						2016-09-24 01:17:43 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e48df859b5 
							
						 
					 
					
						
						
							
							Fix typedef import in span.pyx  
						
						
						
					 
					
						2016-09-23 16:02:28 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							4de13606fd 
							
						 
					 
					
						
						
							
							Fix token.pyx  
						
						
						
					 
					
						2016-09-23 15:07:07 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							b4de419e19 
							
						 
					 
					
						
						
							
							Import hash_t typedef in token.pyx  
						
						
						
					 
					
						2016-09-23 14:22:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							c1a2e96604 
							
						 
					 
					
						
						
							
							Clean up notes at end of token.pyx  
						
						
						
					 
					
						2016-09-21 20:45:51 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							58e83fe34b 
							
						 
					 
					
						
						
							
							Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match.  
						
						
						
					 
					
						2016-09-21 14:54:55 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							2735b6247b 
							
						 
					 
					
						
						
							
							Fix orths_and_spaces in Doc.__init__  
						
						
						
					 
					
						2016-09-21 14:52:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							cdc10e9a1c 
							
						 
					 
					
						
						
							
							* Fix Issue  #375 : noun phrase iteration results in index error if noun phrases are merged during the loop. Fix by accumulating the spans inside the noun_chunks property, allowing the Span index tricks to work.  
						
						
						
					 
					
						2016-05-20 10:14:06 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							5d86c30f0b 
							
						 
					 
					
						
						
							
							* Fix Issue  #367 : Missing has_vector property on Doc and Span objects  
						
						
						
					 
					
						2016-05-09 12:36:14 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							8c0888d6cb 
							
						 
					 
					
						
						
							
							* Fix error in span.sent  
						
						
						
					 
					
						2016-05-06 00:28:05 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							26095f9722 
							
						 
					 
					
						
						
							
							* Add span.sent property, re Issue  #366  
						
						
						
					 
					
						2016-05-06 00:17:38 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							76021cb853 
							
						 
					 
					
						
						
							
							* Fix bug in Doc.text, introduced by  a862edc 
						
						
						
					 
					
						2016-05-04 11:02:16 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							29a114e645 
							
						 
					 
					
						
						
							
							* Don't assign 0-valued tags in Doc.from_array  
						
						
						
					 
					
						2016-05-02 16:07:50 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							276fbe9996 
							
						 
					 
					
						
						
							
							* Fix assignment of iterator on Doc object  
						
						
						
					 
					
						2016-05-02 15:26:24 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							508fd1f6dc 
							
						 
					 
					
						
						
							
							* Refactor noun chunk iterators, so that they're simple functions. Install the iterator when the Doc is created, but allow users to write to the noun_chunk_iterator attribute. The iterator functions accept an object and yield (int start, int end, int label) triples.  
						
						
						
					 
					
						2016-05-02 14:25:10 +02:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6df3858dbc 
							
						 
					 
					
						
						
							
							* Fix Issue  #323 : Incorrect semantics of Token.__str__ built-in. Add flag to allow users to switch the old semantics back on, to ease transition.  
						
						
						
					 
					
						2016-04-12 13:17:59 +10:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							872695759d 
							
						 
					 
					
						
						
							
							Merge pull request  #306  from wbwseeker/german_noun_chunks  
						
						
						
						
						add German noun chunk functionality 
						
					 
					
						2016-04-08 00:54:24 +10:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							26622f0ffc 
							
						 
					 
					
						
						
							
							Merge branch 'master' of ssh://github.com/honnibal/spaCy  
						
						
						
					 
					
						2016-03-29 14:31:52 +11:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							ad119c074f 
							
						 
					 
					
						
						
							
							* Fix incorrect whitespacing in Doc.text. This change is potentially breaking, to anyone who was relying on the previous incorrect semantics.  
						
						
						
					 
					
						2016-03-29 13:02:42 +11:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							d65ef41d08 
							
						 
					 
					
						
						
							
							make error messages language independent  
						
						
						
					 
					
						2016-03-24 11:47:09 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							5080077097 
							
						 
					 
					
						
						
							
							revert init_model.py back to pre-german state (because it makes more sense)  
						
						
						
						
						simplify token.n_rights and token.n_lefts 
						
					 
					
						2016-03-21 16:10:25 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							5e2e8e951a 
							
						 
					 
					
						
						
							
							add baseclass DocIterator for iterators over documents  
						
						
						
						
						add classes for English and German noun chunks
the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model 
						
					 
					
						2016-03-16 15:53:35 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							2ae253ef5b 
							
						 
					 
					
						
						
							
							changed head.__set__ to make it simpler  
						
						
						
					 
					
						2016-03-14 13:43:48 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							46e3f979f1 
							
						 
					 
					
						
						
							
							add function for setting head and label to token  
						
						
						
						
						change PseudoProjectivity.deprojectivize to use these functions 
						
					 
					
						2016-03-11 17:31:06 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							03fb498dbe 
							
						 
					 
					
						
						
							
							introduce lang field for LexemeC to hold language id  
						
						
						
						
						put noun_chunk logic into iterators.py for each language separately 
						
					 
					
						2016-03-10 13:01:34 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							d9312bc9ea 
							
						 
					 
					
						
						
							
							add new files npchunks.{pyx,pxd} to hold noun phrase chunk generators  
						
						
						
					 
					
						2016-03-09 16:18:48 +01:00 
						 
				 
			
				
					
						
							
							
								Wolfgang Seeker 
							
						 
					 
					
						
						
						
						
							
						
						
							3448cb40a4 
							
						 
					 
					
						
						
							
							integrated pseudo-projective parsing into parser  
						
						
						
						
						- nonproj.pyx holds a class PseudoProjectivity which currently holds
  all functionality to implement Nivre & Nilsson 2005's pseudo-projective
  parsing using the HEAD decoration scheme
- changed lefts/rights in Token to account for possible non-projective
  structures 
						
					 
					
						2016-03-01 10:09:08 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							af8514cb0c 
							
						 
					 
					
						
						
							
							* Refine the way the is_parsed attribute is set by from_array  
						
						
						
					 
					
						2016-02-06 14:44:35 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e66d45bf66 
							
						 
					 
					
						
						
							
							* Restore previous patch to Span.root, as it seems it wasn't the cause of the problem.  
						
						
						
					 
					
						2016-02-06 13:37:41 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							031b00cb91 
							
						 
					 
					
						
						
							
							* Fix Span.root calculation  
						
						
						
					 
					
						2016-02-05 20:12:09 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							e5c447e237 
							
						 
					 
					
						
						
							
							* Questionable fix to problem in Span.root  
						
						
						
					 
					
						2016-02-05 19:18:35 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							1ef84a0557 
							
						 
					 
					
						
						
							
							* Merge master into rethinc2  
						
						
						
					 
					
						2016-02-05 12:55:59 +01:00 
						 
				 
			
				
					
						
							
							
								Matthew Honnibal 
							
						 
					 
					
						
						
						
						
							
						
						
							6aa92b70f1 
							
						 
					 
					
						
						
							
							* Fix merge problem in span  
						
						
						
					 
					
						2016-02-05 12:46:11 +01:00