| 
							
							
								 Ines Montani | 347c4a2d06 | Reorganise and reformat global tokenizer prefixes, suffixes and infixes | 2017-01-08 20:37:39 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0dec90e9f7 | Use global abbreviation data languages and remove duplicates | 2017-01-08 20:36:00 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 7c3cb2a652 | Add global abbreviations data | 2017-01-08 20:34:03 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | de5aa92bc2 | Handle deprecated tokenizer prefix data | 2017-01-08 20:33:28 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | abb09782f9 | Move sun.txt to original location and fix path to not break parser tests | 2017-01-08 20:32:54 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 57919566b8 | Add Jupyter notebooks repo to resources list | 2017-01-05 20:50:08 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | cab39c59c5 | Add missing contractions to English tokenizer exceptions. Inspired by https://github.com/kootenpv/contractions/blob/master/contractions/__init__.py | 2017-01-05 19:59:06 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a23504fe07 | Move abbreviations below other exceptions | 2017-01-05 19:58:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 7d2cf934b9 | Generate he/she/it correctly with 's instead of 've | 2017-01-05 19:57:00 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8328925e1f | Add newlines to long German text | 2017-01-05 18:13:30 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 55b46d7cf6 | Add tokenizer tests for German | 2017-01-05 18:11:25 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5bb4081f52 | Remove redundant test_tokenizer.py for English | 2017-01-05 18:11:11 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8216ba599b | Add tests for longer and mixed English texts | 2017-01-05 18:11:04 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 65f937d5c6 | Move basic contraction tests to test_contractions.py | 2017-01-05 18:09:53 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | bbe7cab3a1 | Move non-English-specific tests back to general tokenizer tests | 2017-01-05 18:09:29 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 038002d616 | Reformat HU tokenizer tests and adapt to general style. Improve readability of test cases and add conftest.py with fixture | 2017-01-05 18:06:44 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | bc911322b3 | Move ") to emoticons (see Tweebo challenge test) | 2017-01-05 18:05:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 637f785036 | Add general sanity tests for all tokenizers | 2017-01-05 16:25:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c5f2dc15de | Move English tokenizer tests to directory /en | 2017-01-05 16:25:04 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8b45363b4d | Modernize and merge general tokenizer tests | 2017-01-05 13:17:05 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 02cfda48c9 | Modernize and merge tokenizer tests for string loading | 2017-01-05 13:16:55 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a11f684822 | Modernize and merge tokenizer tests for whitespace | 2017-01-05 13:16:33 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8b284fc6f1 | Modernize and merge tokenizer tests for text from file | 2017-01-05 13:15:52 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 2c2e878653 | Modernize and merge tokenizer tests for punctuation | 2017-01-05 13:14:16 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8a74129cdf | Modernize and merge tokenizer tests for prefixes/suffixes/infixes | 2017-01-05 13:13:12 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0e65dca9a5 | Modernize and merge tokenizer tests for exception and emoticons | 2017-01-05 13:11:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 34c47bb20d | Fix formatting | 2017-01-05 13:10:51 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 2e72683baa | Add missing docstrings | 2017-01-05 13:10:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | da10a049a6 | Add unicode declarations | 2017-01-05 13:09:48 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 58adae8774 | Remove unused file | 2017-01-05 13:09:22 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c6e5a5349d | Move regression test for #360 into own file | 2017-01-04 00:49:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8279993a6f | Modernize and merge tokenizer tests for punctuation | 2017-01-04 00:49:20 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 550630df73 | Update tokenizer tests for contractions | 2017-01-04 00:48:42 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 109f202e8f | Update conftest fixture | 2017-01-04 00:48:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | ee6b49b293 | Modernize tokenizer tests for emoticons | 2017-01-04 00:47:59 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | f09b5a5dfd | Modernize tokenizer tests for infixes | 2017-01-04 00:47:42 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 59059fed27 | Move regression test for #351 to own file | 2017-01-04 00:47:11 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 667051375d | Modernize tokenizer tests for whitespace | 2017-01-04 00:46:35 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | aafc894285 | Modernize tokenizer tests for contractions. Use @pytest.mark.parametrize. | 2017-01-03 23:02:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 1d237664af | Add lowercase lemma to tokenizer exceptions | 2017-01-03 23:02:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | dd7cd44ba5 | Update README.rst | 2017-01-03 21:27:25 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d677db6277 | Change "Multi-language support" to amber for spaCy | 2017-01-03 21:24:35 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 6f51609b5e | Use yellow color for neutral pro/con icon | 2017-01-03 21:24:14 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 84a87951eb | Fix typos | 2017-01-03 18:27:43 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 35b39f53c3 | Reorganise English tokenizer exceptions (as discussed in #718). Add logic to generate exceptions that follow a consistent pattern (like verbs and pronouns) and allow certain tokens to be excluded explicitly. | 2017-01-03 18:26:09 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | fb9d3bb022 | Revert "Merge remote-tracking branch 'origin/master'". This reverts commit d3b181cdf1, reversing changes made to b19cfcc144. | 2017-01-03 18:21:36 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 461cbb99d8 | Revert "Reorganise English tokenizer exceptions (as discussed in #718)". This reverts commit b19cfcc144. | 2017-01-03 18:21:29 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d3b181cdf1 | Merge remote-tracking branch 'origin/master'. Conflicts: spacy/en/tokenizer_exceptions.py | 2017-01-03 18:20:01 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | b19cfcc144 | Reorganise English tokenizer exceptions (as discussed in #718). Add logic to generate exceptions that follow a consistent pattern (like verbs and pronouns) and allow certain tokens to be excluded explicitly. | 2017-01-03 18:17:57 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 4fc4d3d0e3 | Update PULL_REQUEST_TEMPLATE.md | 2017-01-03 15:41:16 +01:00 |  |