| Matthew Honnibal | 9e00798820 | * Work on integrating a greedy dependency parser | 2014-12-16 08:06:04 +11:00 |
| Matthew Honnibal | 24ffc32f2f | * Another redraft of index.rst | 2014-12-15 16:32:03 +11:00 |
| Matthew Honnibal | 77dd7a212a | * More thoughts on intro | 2014-12-15 09:19:29 +11:00 |
| Matthew Honnibal | 792802b2b9 | * POS tag memoisation working, with good speed-up | 2014-12-12 14:33:51 +11:00 |
| Matthew Honnibal | ca54d58638 | * Merge setup.py | 2014-12-10 15:21:27 +11:00 |
| Matthew Honnibal | 9959a64f7b | * Working morphology and lemmatisation. POS tagging quite fast. | 2014-12-10 08:09:32 +11:00 |
| Matthew Honnibal | 7831b06610 | * Compile morphology.pyx file | 2014-12-10 08:09:13 +11:00 |
| Matthew Honnibal | df3be14987 | * Add pos_type features to POS tagger | 2014-12-10 08:08:55 +11:00 |
| Matthew Honnibal | 42973c4b37 | * Improve efficiency of tagger, and improve morphological processing | 2014-12-10 01:02:04 +11:00 |
| Matthew Honnibal | 6b34a2f34b | * Move morphological analysis into its own module, morphology.pyx | 2014-12-09 21:16:17 +11:00 |
| Matthew Honnibal | b962fe73d7 | * Make suffixes file use full-power regex, so that we can handle periods properly | 2014-12-09 19:04:27 +11:00 |
| Matthew Honnibal | accdbe989b | * Remove Tokens.extend method | 2014-12-09 17:09:23 +11:00 |
| Matthew Honnibal | 495e1c7366 | * Use fused type in Tokens.push_back, simplifying the use of the cache | 2014-12-09 16:50:01 +11:00 |
| Matthew Honnibal | 516f0f1e14 | * Remove test for loading ad hoc rules format | 2014-12-09 16:08:45 +11:00 |
| Matthew Honnibal | 6369835306 | * Add false positive test for emoticons | 2014-12-09 16:08:17 +11:00 |
| Matthew Honnibal | f15deaad5b | * Upd docs | 2014-12-09 16:08:01 +11:00 |
| Matthew Honnibal | 1ccabc806e | * Work on lemmatization | 2014-12-09 16:06:18 +11:00 |
| Matthew Honnibal | 2a6bd2818f | * Load the lexicon before we check flag values | 2014-12-09 15:18:43 +11:00 |
| Matthew Honnibal | 302e09018b | * Work on fixing special-cases, reading them in as JSON objects so that they can specify lemmas | 2014-12-09 14:48:01 +11:00 |
| Matthew Honnibal | cda9ea9a4a | * Add test to make sure iterating over the lexicon isnt broken | 2014-12-08 21:12:51 +11:00 |
| Matthew Honnibal | 99bbbb6feb | * Work on morphological processing | 2014-12-08 21:12:15 +11:00 |
| Matthew Honnibal | 7b68f911cf | * Add WordNet lemmatizer | 2014-12-08 01:39:13 +11:00 |
| Matthew Honnibal | c20dd79748 | * Fiddle with const correctness and comments | 2014-12-08 00:03:55 +11:00 |
| Matthew Honnibal | b031c7c430 | * Remove language-general context module | 2014-12-07 23:53:01 +11:00 |
| Matthew Honnibal | ef4398b204 | * Rearrange POS stuff, so that language-specific stuff can live in language-specific modules | 2014-12-07 23:52:41 +11:00 |
| Matthew Honnibal | 327383e38a | * Remove unused code in tagger.pyx | 2014-12-07 22:16:17 +11:00 |
| Matthew Honnibal | 8f2f319c57 | * Add a couple more contractions tests | 2014-12-07 22:08:04 +11:00 |
| Matthew Honnibal | 9f17467c2e | * Fix EMPTY_TOKEN | 2014-12-07 22:07:41 +11:00 |
| Matthew Honnibal | 3819a88e1b | * Add support for tag dictionary, and fix error-code for predict method | 2014-12-07 22:07:16 +11:00 |
| Matthew Honnibal | f00afe12c4 | * Load POS tagger in load() function if path exists | 2014-12-07 22:05:57 +11:00 |
| Matthew Honnibal | 677e111ee7 | * Revise tokenization rules to match PTB. Rules are pretty messy around periods, need better support for these. | 2014-12-07 22:04:47 +11:00 |
| Matthew Honnibal | 5fe5e6e66b | * Move context functions to header, inlining them. | 2014-12-07 21:59:04 +11:00 |
| Matthew Honnibal | 91e8d9ea1c | * Compile context.pyx and tagger.pyx modules | 2014-12-07 15:29:54 +11:00 |
| Matthew Honnibal | 5caabec789 | * Link in tagger, to work on integrating POS tagging | 2014-12-07 15:29:41 +11:00 |
| Matthew Honnibal | 0c7aeb9de7 | * Begin revising tagger, focussing on POS tagging | 2014-12-07 15:29:04 +11:00 |
| Matthew Honnibal | f5c4f2eb52 | * Revise context, focussing on POS tagging for now | 2014-12-07 15:28:22 +11:00 |
| Matthew Honnibal | e27b912ef9 | * Remove need for confusing _data pointer to be stored on Tokens | 2014-12-05 16:31:30 +11:00 |
| Matthew Honnibal | 1c9253701d | * Introduce a TokenC struct, to handle token indices, pos tags and sense tags | 2014-12-05 15:56:14 +11:00 |
| Matthew Honnibal | 187372c7f3 | * Allow the lexicon to create lexemes using an external memory pool, so that it can decide to make some lexemes temporary, rather than cached | 2014-12-05 03:29:50 +11:00 |
| Matthew Honnibal | 75b8dfb348 | * Remove upper_pc from lexeme.pyx | 2014-12-04 22:14:34 +11:00 |
| Matthew Honnibal | a14f9eaf63 | * Add index.pyx to setup | 2014-12-04 22:14:11 +11:00 |
| Matthew Honnibal | 49f3780ff5 | * Fiddle with lexeme attrs | 2014-12-04 21:22:38 +11:00 |
| Matthew Honnibal | 564082e48e | * Hack Token class to take lex.dense inplace of the old lex.norm. This needs to be fixed... | 2014-12-04 20:51:29 +11:00 |
| Matthew Honnibal | 69bb022204 | * Add as_array and count_by method | 2014-12-04 20:46:55 +11:00 |
| Matthew Honnibal | e1b1f45cc9 | * Add STEM attribute to lexeme | 2014-12-04 20:46:20 +11:00 |
| Matthew Honnibal | d7952634ca | * Make the string-store serve const pointers to Utf8Str | 2014-12-03 16:01:47 +11:00 |
| Matthew Honnibal | 7e04c22f8f | * const added to Lexicon interface. Seems to work. | 2014-12-03 15:58:17 +11:00 |
| Matthew Honnibal | d70d31aa45 | * Introduce first attempt at const-ness | 2014-12-03 15:44:25 +11:00 |
| Matthew Honnibal | d0d812c548 | * Hack setup.py to exclude tagger stuff | 2014-12-03 11:06:57 +11:00 |
| Matthew Honnibal | 4560ada85b | * Add typedef for attr_t. Change flag_t to flags_t | 2014-12-03 11:06:31 +11:00 |