Matthew Honnibal | 144a93c2a5 | Back-off to tensor for similarity if no vectors | 2017-11-03 20:56:33 +01:00
ines | 9659391944 | Update deprecated methods and add warnings | 2017-11-01 16:49:42 +01:00
Matthew Honnibal | 9e0ebee81c | Add Token.is_sent_start property, so can deprecate Token.sent_start | 2017-11-01 13:27:14 +01:00
Matthew Honnibal | 86eba61fae | Fix token.vector when vectors are missing | 2017-11-01 00:47:35 +01:00
ines | 544a407b93 | Tidy up Doc, Token and Span and add missing docs | 2017-10-27 17:07:26 +02:00
ines | 6a0483b7aa | Tidy up and document Doc, Token and Span | 2017-10-27 15:41:45 +02:00
Matthew Honnibal | b66b8f028b | Fix #1375 -- out-of-bounds on token.nbor() | 2017-10-24 12:10:39 +02:00
Matthew Honnibal | e0a9b02b67 | Merge Span._ and Span.as_doc methods | 2017-10-09 22:00:15 -05:00
ines | 3fc4fe61d2 | Fix typo | 2017-10-10 04:15:14 +02:00
Matthew Honnibal | 080afd4924 | Add ternary value setting to Token.sent_start | 2017-10-08 23:51:58 +02:00
Matthew Honnibal | 668a0ea640 | Pass extensions into Underscore class | 2017-10-07 18:56:01 +02:00
Matthew Honnibal | d55d6e1cfa | Fix comparison of Token from different docs. Closes #1257 | 2017-08-19 16:39:32 +02:00
Matthew Honnibal | f4662e9218 | Fix vector linkage for token | 2017-06-04 14:19:58 -05:00
Matthew Honnibal | 498ad85309 | Try using tensor for vector/similarity methdos | 2017-05-30 23:35:17 +02:00
Matthew Honnibal | fe11564b8e | Finish stringstore change. Also xfail vectors tests | 2017-05-28 15:10:22 +02:00
Matthew Honnibal | 2445707f3c | Re-delegate vectors to vocab | 2017-05-28 11:46:10 +02:00
Matthew Honnibal | 01e59e4e6e | * Add Token.sent_start property, re Issue #235 | 2017-05-23 18:41:11 +02:00
ines | 7ed8a92ed1 | Update docstrings and API docs for Token | 2017-05-20 15:13:33 +02:00
ines | a804045597 | Use is_ancestor instead of deprecated is_ancestor_of | 2017-05-19 20:23:40 +02:00
ines | e9e62b01b0 | Update docstrings and API docs for Token | 2017-05-19 18:47:56 +02:00
ines | 9d85cda8e4 | Fix models error message and use about.__docs_models__ (see #1051) | 2017-05-13 13:05:47 +02:00
ines | 6b942763f0 | Tidy up imports | 2017-05-13 13:04:40 +02:00
Matthew Honnibal | 6a4221a6de | Allow lemma to be set from Python. Re #973 | 2017-04-16 18:07:53 +02:00
ines | 0739ae7b76 | Tidy up and fix formatting and imports | 2017-04-15 13:05:15 +02:00
ines | e71a1f4bd0 | Fix download commands in error messages (see #946) | 2017-04-01 10:20:57 +02:00
Matthew Honnibal | fc3900e5b2 | Allow ent_id to be set in Token | 2017-03-31 14:00:14 +02:00
ines | 66c1f194f9 | Use consistent unicode declarations | 2017-03-12 13:07:28 +01:00
Roman Inflianskas | 66e1109b53 | Add support for Universal Dependencies v2.0 | 2017-03-03 13:17:34 +01:00
Matthew Honnibal | e7f8e13cf3 | Make Token hashable. Fixes #743 | 2017-01-16 13:27:57 +01:00
Matthew Honnibal | 12cd27b821 | Amend 8ae8b443f: Handle comparison with None tokens. | 2017-01-11 13:03:32 +01:00
Matthew Honnibal | 8ae8b443f1 | Add richcmp method to Token. Closes #631 | 2017-01-09 19:30:31 +01:00
Matthew Honnibal | 404019ad2f | Fix issue #672: ent_iob_ was a string, not unicode, due to missing unicode_literals statement. | 2016-12-18 22:33:53 +01:00
Matthew Honnibal | 293c79c09a | Fix #595: Lemmatization was incorrect for base forms, because morphological analyser wasn't adding morphology properly. | 2016-11-04 00:29:07 +01:00
Matthew Honnibal | 05a8b752a2 | Fix Issue #600: Missing setters for Token attribute. | 2016-11-02 23:28:59 +01:00
Matthew Honnibal | 11664b9f20 | Fix variable error in token | 2016-11-01 13:28:00 +01:00
Matthew Honnibal | b86f8af0c1 | Fix doc strings | 2016-11-01 12:25:36 +01:00
Matthew Honnibal | 5d5742b773 | Add sentiment field to doc, rename getters_for_tokens and getters_for_spans, add user_hooks field to Doc. | 2016-10-19 20:54:22 +02:00
Matthew Honnibal | 7fd98fc91c | Remove deprecation shim around str/bytes in Token. | 2016-10-17 14:02:47 +02:00
Matthew Honnibal | c1abc8f6ed | Fix deprecation stuff in Token: Remove the shim for the str/unicode semantics, and raise for has_repvec and repvec | 2016-10-17 11:18:41 +02:00
Matthew Honnibal | 5d10e2005c | Defer some attributes to Doc, via getters_for_tokens attribute. | 2016-10-17 02:44:49 +02:00
Matthew Honnibal | ca32a1ab01 | Revert "Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good." This reverts commit 8423e8627f. | 2016-09-30 20:20:22 +02:00
Matthew Honnibal | 6736977d82 | Revert "Changes to Doc and Token for new string store scheme" This reverts commit 99de44d864. | 2016-09-30 20:11:15 +02:00
Matthew Honnibal | 99de44d864 | Changes to Doc and Token for new string store scheme | 2016-09-30 20:00:21 +02:00
Matthew Honnibal | 8423e8627f | Work on Issue #285: intern strings into document-specific pools, to address streaming data memory growth. StringStore.__getitem__ now raises KeyError when it can't find the string. Use StringStore.intern() to get the old behaviour. Still need to hunt down all uses of StringStore.__getitem__ in library and do testing, but logic looks good. | 2016-09-30 10:14:47 +02:00
Matthew Honnibal | 4de13606fd | Fix token.pyx | 2016-09-23 15:07:07 +02:00
Matthew Honnibal | b4de419e19 | Import hash_t typedef in token.pyx | 2016-09-23 14:22:06 +02:00
Matthew Honnibal | c1a2e96604 | Clean up notes at end of token.pyx | 2016-09-21 20:45:51 +02:00
Matthew Honnibal | 58e83fe34b | Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match. | 2016-09-21 14:54:55 +02:00
Matthew Honnibal | 6df3858dbc | * Fix Issue #323: Incorrect semantics of Token.__str__ built-in. Add flag to allow users to switch the old semantics back on, to ease transition. | 2016-04-12 13:17:59 +10:00
Matthew Honnibal | 872695759d | Merge pull request #306 from wbwseeker/german_noun_chunks: add German noun chunk functionality | 2016-04-08 00:54:24 +10:00