| Matthew Honnibal | fe442cac53 | Fix #717: Set correct lemma for contracted verbs | 2017-03-18 16:16:10 +01:00 |
| Matthew Honnibal | 8dbff4f5f4 | Wire up English lemma and morph rules. | 2017-03-15 09:23:22 -05:00 |
| ines | ce9568af84 | Move English time exceptions ("1a.m." etc.) and refactor | 2017-03-12 13:58:22 +01:00 |
| ines | 6b30541774 | Fix formatting | 2017-03-12 13:58:22 +01:00 |
| ines | 66c1f194f9 | Use consistent unicode declarations | 2017-03-12 13:07:28 +01:00 |
| ines | 30ce2a6793 | Exclude "shed" and "Shed" from tokenizer exceptions (see #847) | 2017-02-18 14:10:44 +01:00 |
| Ines Montani | 209c37bbcf | Exclude "shell" and "Shell" from English tokenizer exceptions (resolves #775) | 2017-01-25 13:15:02 +01:00 |
| Ines Montani | 50878ef598 | Exclude "were" and "Were" from tokenizer exceptions and add regression test (resolves #744) | 2017-01-16 13:10:38 +01:00 |
| Matthew Honnibal | fba67fa342 | Fix Issue #736: Times were being tokenized with incorrect string values. | 2017-01-12 11:21:01 +01:00 |
| Ines Montani | 0dec90e9f7 | Use global abbreviation data languages and remove duplicates | 2017-01-08 20:36:00 +01:00 |
| Ines Montani | cab39c59c5 | Add missing contractions to English tokenizer exceptions Inspired by https://github.com/kootenpv/contractions/blob/master/contractions/__init__.py | 2017-01-05 19:59:06 +01:00 |
| Ines Montani | a23504fe07 | Move abbreviations below other exceptions | 2017-01-05 19:58:07 +01:00 |
| Ines Montani | 7d2cf934b9 | Generate he/she/it correctly with 's instead of 've | 2017-01-05 19:57:00 +01:00 |
| Ines Montani | bc911322b3 | Move ") to emoticons (see Tweebo challenge test) | 2017-01-05 18:05:38 +01:00 |
| Ines Montani | 1d237664af | Add lowercase lemma to tokenizer exceptions | 2017-01-03 23:02:21 +01:00 |
| Ines Montani | 84a87951eb | Fix typos | 2017-01-03 18:27:43 +01:00 |
| Ines Montani | 35b39f53c3 | Reorganise English tokenizer exceptions (as discussed in #718) Add logic to generate exceptions that follow a consistent pattern (like verbs and pronouns) and allow certain tokens to be excluded explicitly. | 2017-01-03 18:26:09 +01:00 |
| Ines Montani | 461cbb99d8 | Revert "Reorganise English tokenizer exceptions (as discussed in #718)" This reverts commit b19cfcc144. | 2017-01-03 18:21:29 +01:00 |
| Ines Montani | b19cfcc144 | Reorganise English tokenizer exceptions (as discussed in #718) Add logic to generate exceptions that follow a consistent pattern (like verbs and pronouns) and allow certain tokens to be excluded explicitly. | 2017-01-03 18:17:57 +01:00 |
| Ines Montani | 78e63dc7d0 | Update tokenizer exceptions for English | 2016-12-21 18:06:34 +01:00 |
| Ines Montani | 704c7442e0 | Break language data components into their own files | 2016-12-18 15:36:53 +01:00 |