| Matthew Honnibal | 12699a1152 | * Set initial freqs, to avoid missing values in serializer | 2015-07-23 01:16:27 +02:00 |
| Matthew Honnibal | 317cbbc015 | * Serialization round trip now working with decent API, but with rough spots in the organisation and requiring vocabulary to be fixed ahead of time. | 2015-07-19 15:18:17 +02:00 |
| Matthew Honnibal | 4dddc8a69b | * Fix type declarations for attr_t. Remove unused id_t. | 2015-07-18 22:39:57 +02:00 |
| Matthew Honnibal | 95e57c2780 | * Remove unnecessary key and id properties from Utf8String. | 2015-07-17 01:40:18 +02:00 |
| Matthew Honnibal | 6eef0bf9ab | * Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx | 2015-07-13 20:20:58 +02:00 |
| Matthew Honnibal | 89a91ad726 | * Add SPACE part-of-speech tag, and train tagger to assign it. Also train tagger not to make whitespace an entity | 2015-07-09 13:30:41 +02:00 |
| Matthew Honnibal | bb522496dd | * Rename Tokens to Doc | 2015-07-08 18:53:00 +02:00 |
| Matthew Honnibal | fb8d50b3d5 | Merge branch 'master' of ssh://github.com/honnibal/spaCy | 2015-04-30 12:45:15 +02:00 |
| Matthew Honnibal | 378c2a6435 | * Fix POS model: make it use tag instead of pos in history features | 2015-04-29 00:02:53 +02:00 |
| Jordan Suchow | 3a8d9b37a6 | Remove trailing whitespace | 2015-04-19 13:01:38 -07:00 |
| Matthew Honnibal | c6707778dd | * Fix Issue #51: Handle non-ascii lemmas correctly | 2015-04-13 22:28:59 +02:00 |
| Matthew Honnibal | 567388e38d | * Use values encoded by StringStore in POS tagging, rather than indices into a list of tags | 2015-03-26 16:44:45 +01:00 |
| Matthew Honnibal | 8cc3524dc9 | * Ws | 2015-03-26 16:44:41 +01:00 |
| Matthew Honnibal | caf046b220 | * Hastily add method to apply tags from a list of strings, instead of predicting the tags. | 2015-02-23 15:40:17 -05:00 |
| Matthew Honnibal | 0b7e769211 | * Add POS tags to support SWBD tag set | 2015-02-11 14:08:28 -05:00 |
| Matthew Honnibal | 312b3a45f3 | * Fix issue #19: Allow parsing/pos tagging of empty strings | 2015-02-10 10:15:58 -05:00 |
| Matthew Honnibal | 5c3513583d | * Clear buffered python tokens when modifying the Tokens object. Need to clean this up, and modify via a method on Tokens. | 2015-02-09 03:57:10 -05:00 |
| Matthew Honnibal | be5536d239 | * Fix Issue #22: PRP and PRP$ were mapped to NOUN. Should be PRON. | 2015-02-08 18:36:18 -05:00 |
| Matthew Honnibal | 56c2ef2982 | * Tweak POS features for web text | 2015-02-02 11:59:36 +11:00 |
| Matthew Honnibal | 024cfd485c | * Pass tag_strings as a tuple, to support new Tokens API | 2015-01-31 13:43:37 +11:00 |
| Matthew Honnibal | 67d6e53a69 | * Ensure parser and tagger function correctly when training from missing values, indicated by -1 | 2015-01-30 14:08:56 +11:00 |
| Matthew Honnibal | 12b034e3ef | * Move POS tag definitions to parts_of_speech.pxd | 2015-01-25 16:31:07 +11:00 |
| Matthew Honnibal | 7431c133d8 | * Add error if try to access head and not is_parsed | 2015-01-25 15:33:54 +11:00 |
| Matthew Honnibal | 4e857ab7a6 | * Fix bug in POS tagger feature | 2015-01-25 02:20:15 +11:00 |
| Matthew Honnibal | a97bed9359 | * Fix POS and dependency label tag names.  Add parse and string navigation functions. | 2015-01-24 17:29:04 +11:00 |
| Matthew Honnibal | 5ed8b2b98f | * Rename sic to orth | 2015-01-23 02:08:25 +11:00 |
| Matthew Honnibal | 6c7e44140b | * Work on word vectors, and other stuff | 2015-01-17 16:21:17 +11:00 |
| Matthew Honnibal | 0930892fc1 | * Tmp. Working on refactor. Compiles, must hook up lexical feats. | 2015-01-14 00:03:48 +11:00 |
| Matthew Honnibal | 46da3d74d2 | * Tmp. Refactoring, introducing a Lexeme PyObject. | 2015-01-12 11:23:44 +11:00 |
| Matthew Honnibal | ce2edd6312 | * Tmp commit. Refactoring to create a Python Lexeme class. | 2015-01-12 10:26:22 +11:00 |
| Matthew Honnibal | 3f1944d688 | * Make PyPy work | 2015-01-05 17:54:38 +11:00 |
| Matthew Honnibal | 94034f1112 | * Fix encoding in lemmatization | 2015-01-05 11:54:29 +11:00 |
| Matthew Honnibal | 0e4c2ba036 | * Fix loading of special morph words | 2015-01-03 23:13:00 +11:00 |
| Matthew Honnibal | 5d9a096e2f | * Some minor clean-up after HastyModel | 2014-12-31 19:46:04 +11:00 |
| Matthew Honnibal | aafaf58cbe | * Refactor _ml.Model, and finish implementing HastyModel so far not worthwhile. | 2014-12-31 19:40:59 +11:00 |
| Matthew Honnibal | 1a075f77ff | * Don't over-ride pre-loaded POS tags, if set by special-cases | 2014-12-30 23:26:32 +11:00 |
| Matthew Honnibal | bb0b00f819 | * Repurporse the Tagger class as a generic Model, wrapping thinc's interface | 2014-12-30 21:20:15 +11:00 |
| Matthew Honnibal | bb80937544 | * Upd docstrings | 2014-12-27 18:45:16 +11:00 |
| Matthew Honnibal | b8b65903fc | * Tmp | 2014-12-24 17:42:00 +11:00 |
| Matthew Honnibal | b00bc01d8c | * All tests now passing for reorg | 2014-12-23 13:18:59 +11:00 |
| Matthew Honnibal | 73f200436f | * Tests passing except for morphology/lemmatization stuff | 2014-12-23 11:40:32 +11:00 |
| Matthew Honnibal | 61df50b598 | * Add English-subclass POS tagger | 2014-12-21 20:59:07 +11:00 |