| Matthew Honnibal | f00afe12c4 | * Load POS tagger in load() function if path exists | 2014-12-07 22:05:57 +11:00 |
| Matthew Honnibal | 677e111ee7 | * Revise tokenization rules to match PTB. Rules are pretty messy around periods, need better support for these. | 2014-12-07 22:04:47 +11:00 |
| Matthew Honnibal | 5fe5e6e66b | * Move context functions to header, inlining them. | 2014-12-07 21:59:04 +11:00 |
| Matthew Honnibal | 91e8d9ea1c | * Compile context.pyx and tagger.pyx modules | 2014-12-07 15:29:54 +11:00 |
| Matthew Honnibal | 5caabec789 | * Link in tagger, to work on integrating POS tagging | 2014-12-07 15:29:41 +11:00 |
| Matthew Honnibal | 0c7aeb9de7 | * Begin revising tagger, focussing on POS tagging | 2014-12-07 15:29:04 +11:00 |
| Matthew Honnibal | f5c4f2eb52 | * Revise context, focussing on POS tagging for now | 2014-12-07 15:28:22 +11:00 |
| Matthew Honnibal | e27b912ef9 | * Remove need for confusing _data pointer to be stored on Tokens | 2014-12-05 16:31:30 +11:00 |
| Matthew Honnibal | 1c9253701d | * Introduce a TokenC struct, to handle token indices, pos tags and sense tags | 2014-12-05 15:56:14 +11:00 |
| Matthew Honnibal | 187372c7f3 | * Allow the lexicon to create lexemes using an external memory pool, so that it can decide to make some lexemes temporary, rather than cached | 2014-12-05 03:29:50 +11:00 |
| Matthew Honnibal | 75b8dfb348 | * Remove upper_pc from lexeme.pyx | 2014-12-04 22:14:34 +11:00 |
| Matthew Honnibal | a14f9eaf63 | * Add index.pyx to setup | 2014-12-04 22:14:11 +11:00 |
| Matthew Honnibal | 49f3780ff5 | * Fiddle with lexeme attrs | 2014-12-04 21:22:38 +11:00 |
| Matthew Honnibal | 564082e48e | * Hack Token class to take lex.dense inplace of the old lex.norm. This needs to be fixed... | 2014-12-04 20:51:29 +11:00 |
| Matthew Honnibal | 69bb022204 | * Add as_array and count_by method | 2014-12-04 20:46:55 +11:00 |
| Matthew Honnibal | e1b1f45cc9 | * Add STEM attribute to lexeme | 2014-12-04 20:46:20 +11:00 |
| Matthew Honnibal | d7952634ca | * Make the string-store serve const pointers to Utf8Str | 2014-12-03 16:01:47 +11:00 |
| Matthew Honnibal | 7e04c22f8f | * const added to Lexicon interface. Seems to work. | 2014-12-03 15:58:17 +11:00 |
| Matthew Honnibal | d70d31aa45 | * Introduce first attempt at const-ness | 2014-12-03 15:44:25 +11:00 |
| Matthew Honnibal | d0d812c548 | * Hack setup.py to exclude tagger stuff | 2014-12-03 11:06:57 +11:00 |
| Matthew Honnibal | 4560ada85b | * Add typedef for attr_t. Change flag_t to flags_t | 2014-12-03 11:06:31 +11:00 |
| Matthew Honnibal | e600f7b327 | * Move String struct stuff into the utf8string module, from spacy.lang | 2014-12-03 11:06:00 +11:00 |
| Matthew Honnibal | e170faf5b0 | * Hack Tokens to work without tagger.pyx | 2014-12-03 11:05:15 +11:00 |
| Matthew Honnibal | b463a7eb86 | * Make flag-setting a language-specific thing | 2014-12-03 11:04:32 +11:00 |
| Matthew Honnibal | 71b009e323 | * Fix bug in refactored StringStore.__getitem__ | 2014-12-03 11:02:24 +11:00 |
| Matthew Honnibal | 14097311ae | * Make StringStore.__getitem__ accept unicode-typed keys. | 2014-12-03 01:33:20 +11:00 |
| Matthew Honnibal | 522bb0346e | * Work on get_array method of Tokens | 2014-12-02 23:48:05 +11:00 |
| Matthew Honnibal | 8c2938fe01 | * Rename Lexicon._dict to Lexicon._map | 2014-12-02 23:46:59 +11:00 |
| Matthew Honnibal | 2ee8a1e61f | * Make intro chattier, explain philosophy better | 2014-12-02 15:20:18 +11:00 |
| Matthew Honnibal | ea19850a69 | * Add tokenizer section | 2014-12-02 04:39:12 +11:00 |
| Matthew Honnibal | 3430d5f629 | * Revise intro copy. Add NLTK comparison | 2014-12-01 22:55:13 +11:00 |
| Matthew Honnibal | 33dfb4933c | * Remove taggers from Language class. Work on doc strings | 2014-11-26 19:53:55 +11:00 |
| Matthew Honnibal | 80baa2e3db | * Work on beam parser | 2014-11-20 19:49:33 +11:00 |
| Matthew Honnibal | 5c3016bac8 | * Tmp commit of ner code | 2014-11-14 18:27:47 +11:00 |
| Matthew Honnibal | 33c421bcf8 | * More feature tweaks | 2014-11-12 23:59:16 +11:00 |
| Matthew Honnibal | 41dedfb14e | * Add label features for NER parsing | 2014-11-12 23:55:10 +11:00 |
| Matthew Honnibal | cf55b48ba6 | * Switch to predict label on shift. Big increase in accuracy. | 2014-11-12 23:50:12 +11:00 |
| Matthew Honnibal | 8f84e8a78b | * Neaten oracle | 2014-11-12 23:38:07 +11:00 |
| Matthew Honnibal | 66cb4f96e1 | * Upd gitignore | 2014-11-12 23:25:27 +11:00 |
| Matthew Honnibal | 60c1e78596 | * Commit outstanding tests | 2014-11-12 23:24:32 +11:00 |
| Matthew Honnibal | 7e0a9077dd | * Add context files | 2014-11-12 23:22:36 +11:00 |
| Matthew Honnibal | 9b13392ac7 | * Add conll experiments | 2014-11-12 23:22:05 +11:00 |
| Matthew Honnibal | b934bf1c69 | * Compile IOB | 2014-11-12 23:21:40 +11:00 |
| Matthew Honnibal | 3b0b902384 | * IOB-style parsing working. Accuracy down from BILOU, form 87-88 to 85-86 | 2014-11-12 23:21:09 +11:00 |
| Matthew Honnibal | e6bb8aa3a9 | * Move moves to bilou_moves. Refactor context, returning to the simpler giant-enum style | 2014-11-12 00:54:50 +11:00 |
| Matthew Honnibal | c788633429 | * Add tokens_from_list method to Language | 2014-11-11 23:43:14 +11:00 |
| Matthew Honnibal | da70b6bd60 | * Upd tokenization special-cases | 2014-11-11 22:10:15 +11:00 |
| Matthew Honnibal | 95282d4993 | * Use the dynamic oracle 'follow' strategy | 2014-11-11 21:11:17 +11:00 |
| Matthew Honnibal | 60ffdc2eb7 | * Upd fabfile | 2014-11-11 21:10:40 +11:00 |
| Matthew Honnibal | d5e9dce039 | * Compile ner NER code | 2014-11-11 21:10:22 +11:00 |