| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Matthew Honnibal | 6209d94f83 | * Add tests for word shape | 2014-08-30 19:00:10 +02:00 |
| Matthew Honnibal | c282e6d5fb | * Redesign proceeding | 2014-08-28 19:45:09 +02:00 |
| Matthew Honnibal | fd4e61e58b | * Fixed contraction tests. Need to correct problem with the way case stats and tag stats are supposed to work. | 2014-08-27 20:22:33 +02:00 |
| Matthew Honnibal | fdaf24604a | * Basic punct tests updated and passing | 2014-08-27 19:38:57 +02:00 |
| Matthew Honnibal | 9815c7649e | * Refactor around Word objects, adapting tests. Tests passing, except for string views. | 2014-08-23 19:55:06 +02:00 |
| Matthew Honnibal | 6f83dca218 | * Fix import for ptb tokenization test | 2014-08-22 17:05:44 +02:00 |
| Matthew Honnibal | 4bcdd6d31c | * Further improvements to spacy docs, tweaks to code. | 2014-08-22 04:20:24 +02:00 |
| Matthew Honnibal | 01469b0888 | * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. | 2014-08-18 19:14:00 +02:00 |
| Matthew Honnibal | b555e2dc5d | * Add hash tests | 2014-08-02 21:58:31 +01:00 |
| Matthew Honnibal | 6319ff0f22 | * Add length property | 2014-08-02 21:26:44 +01:00 |
| Matthew Honnibal | e494494d80 | * Add tests for group_by | 2014-07-23 17:36:12 +01:00 |
| Matthew Honnibal | bc6c1f6156 | * Add test for open apostrophe bug | 2014-07-07 23:24:20 +02:00 |
| Matthew Honnibal | e60b958b7d | * Add test to check how well we match ptb tokenizer. Needs more text. | 2014-07-07 05:11:31 +02:00 |
| Matthew Honnibal | 2c431f9fdc | * Upd tokenization test | 2014-07-07 05:11:04 +02:00 |
| Matthew Honnibal | 25849fc926 | * Generalize tokenization rules to capitals | 2014-07-07 05:07:21 +02:00 |
| Matthew Honnibal | e4263a241a | * Tests passing for reorganized version | 2014-07-07 04:23:46 +02:00 |
| Matthew Honnibal | 12f8a0e3c2 | * Tests passing for reorganized version | 2014-07-07 04:23:20 +02:00 |
| Matthew Honnibal | a62c38e1ef | * Working tokenization. en doesn't match PTB perfectly. Need to reorganize before adding more schemes. | 2014-07-07 01:15:59 +02:00 |
| Matthew Honnibal | 4e79446dc2 | * Reading in tokenization rules correctly. Passing tests. | 2014-07-07 00:02:55 +02:00 |
| Matthew Honnibal | 9bef797afe | * Rejigged tests. Working possessives, but no other contractions | 2014-07-06 20:02:00 +02:00 |
| Matthew Honnibal | 556f6a18ca | * Initial commit. Tests passing for punctuation handling. Need contractions, file transport, tokenize function, etc. | 2014-07-05 20:51:42 +02:00 |