| ines | 10e29189ac | Adjust URL testcases and xfail problems (instead of comment) | 2017-03-10 14:22:50 +01:00 |
| Dan Rapp | 123d3f2d38 | Fix error in test case parameterization | 2017-03-09 12:18:21 -07:00 |
| Dan Rapp | b9307dfcd7 | Merge branch 'master' into rappdw/tokenizer_exceptions_url_fix | 2017-03-09 11:42:14 -07:00 |
| Dan Rapp | 3b1df3808d | Issue #840 - URL pattenr too broad | 2017-03-09 11:39:39 -07:00 |
| Aniruddha Adhikary | 696215a3fb | add tests for Bengali | 2017-03-05 11:25:12 +06:00 |
| Ines Montani | 138c53ff2e | Merge tokenizer tests | 2017-01-13 01:34:14 +01:00 |
| Ines Montani | 33e5f8dc2e | Create basic and extended test set for URLs | 2017-01-12 23:40:02 +01:00 |
| Ines Montani | ae7edd30e7 | Move text file back to tokenizer tests directory | 2017-01-12 02:10:23 +01:00 |
| Ines Montani | c682b8ca90 | Merge conftests into one cohesive file | 2017-01-11 13:56:32 +01:00 |
| Ines Montani | 869963c3c4 | Mark extensive prefix/suffix tests as slow | 2017-01-10 15:57:35 +01:00 |
| Ines Montani | 487e020ebe | Add simple test for surrounding brackets | 2017-01-10 15:57:26 +01:00 |
| Ines Montani | 0ba5cf51d2 | Assert length first | 2017-01-10 15:57:00 +01:00 |
| Ines Montani | 2185d31907 | Adjust names and formatting | 2017-01-10 15:56:35 +01:00 |
| Ines Montani | e10d4ca964 | Remove semi-redundant URLs and punctuation for faster testing | 2017-01-10 15:54:25 +01:00 |
| Ines Montani | 3a3cb2c90c | Add unicode declaration | 2017-01-10 15:53:15 +01:00 |
| Matthew Honnibal | 42cd598f57 | Use correct fixtures in URL tokenizer | 2017-01-09 14:10:40 +01:00 |
| Ines Montani | aa876884f0 | Revert "Revert "Merge remote-tracking branch 'origin/master'"" This reverts commit fb9d3bb022. | 2017-01-09 13:28:13 +01:00 |
| Ines Montani | abb09782f9 | Move sun.txt to original location and fix path to not break parser tests | 2017-01-08 20:32:54 +01:00 |
| Ines Montani | bbe7cab3a1 | Move non-English-specific tests back to general tokenizer tests | 2017-01-05 18:09:29 +01:00 |
| Ines Montani | 637f785036 | Add general sanity tests for all tokenizers | 2017-01-05 16:25:38 +01:00 |
| Ines Montani | c5f2dc15de | Move English tokenizer tests to directory /en | 2017-01-05 16:25:04 +01:00 |
| Ines Montani | 8b45363b4d | Modernize and merge general tokenizer tests | 2017-01-05 13:17:05 +01:00 |
| Ines Montani | 02cfda48c9 | Modernize and merge tokenizer tests for string loading | 2017-01-05 13:16:55 +01:00 |
| Ines Montani | a11f684822 | Modernize and merge tokenizer tests for whitespace | 2017-01-05 13:16:33 +01:00 |
| Ines Montani | 8b284fc6f1 | Modernize and merge tokenizer tests for text from file | 2017-01-05 13:15:52 +01:00 |
| Ines Montani | 2c2e878653 | Modernize and merge tokenizer tests for punctuation | 2017-01-05 13:14:16 +01:00 |
| Ines Montani | 8a74129cdf | Modernize and merge tokenizer tests for prefixes/suffixes/infixes | 2017-01-05 13:13:12 +01:00 |
| Ines Montani | 0e65dca9a5 | Modernize and merge tokenizer tests for exception and emoticons | 2017-01-05 13:11:31 +01:00 |
| Ines Montani | 34c47bb20d | Fix formatting | 2017-01-05 13:10:51 +01:00 |
| Ines Montani | 2e72683baa | Add missing docstrings | 2017-01-05 13:10:21 +01:00 |
| Ines Montani | da10a049a6 | Add unicode declarations | 2017-01-05 13:09:48 +01:00 |
| Ines Montani | 8279993a6f | Modernize and merge tokenizer tests for punctuation | 2017-01-04 00:49:20 +01:00 |
| Ines Montani | 550630df73 | Update tokenizer tests for contractions | 2017-01-04 00:48:42 +01:00 |
| Ines Montani | 109f202e8f | Update conftest fixture | 2017-01-04 00:48:21 +01:00 |
| Ines Montani | ee6b49b293 | Modernize tokenizer tests for emoticons | 2017-01-04 00:47:59 +01:00 |
| Ines Montani | f09b5a5dfd | Modernize tokenizer tests for infixes | 2017-01-04 00:47:42 +01:00 |
| Ines Montani | 59059fed27 | Move regression test for #351 to own file | 2017-01-04 00:47:11 +01:00 |
| Ines Montani | 667051375d | Modernize tokenizer tests for whitespace | 2017-01-04 00:46:35 +01:00 |
| Ines Montani | aafc894285 | Modernize tokenizer tests for contractions Use @pytest.mark.parametrize. | 2017-01-03 23:02:21 +01:00 |
| Ines Montani | fb9d3bb022 | Revert "Merge remote-tracking branch 'origin/master'" This reverts commit d3b181cdf1, reversing changes made to b19cfcc144. | 2017-01-03 18:21:36 +01:00 |
| Matthew Honnibal | 3ba7c167a8 | Fix URL tests | 2016-12-30 17:10:08 -06:00 |
| Matthew Honnibal | 3e8d9c772e | Test interaction of token_match and punctuation Check that the new token_match function applies after punctuation is split off. | 2016-12-31 00:52:17 +11:00 |
| Gyorgy Orosz | 1748549aeb | Added exception pattern mechanism to the tokenizer. | 2016-12-21 23:16:19 +01:00 |
| Ines Montani | d1c1d3f9cd | Fix tokenizer test | 2016-12-18 16:55:32 +01:00 |
| Ines Montani | 07f0efb102 | Add test for tokenizer regular expressions | 2016-12-07 20:33:28 +01:00 |
| Matthew Honnibal | b6b01d4680 | Remove deprecated tokens_from_list test. | 2016-11-02 23:47:21 +01:00 |
| Matthew Honnibal | cc8bf62208 | * Fix Issue #360: Tokenizer failed when the infix regex matched the start of the string while trying to tokenize multi-infix tokens. | 2016-05-09 13:23:47 +02:00 |
| Matthew Honnibal | b4bfc6ae55 | * Add test for Issue #351: Indices off when leading whitespace | 2016-05-04 15:53:17 +02:00 |
| Matthew Honnibal | 6f82065761 | * Fix infixed commas in tokenizer, re Issue #326. Need to benchmark on empirical data, to make sure this doesn't break other cases. | 2016-04-14 11:36:03 +02:00 |
| Matthew Honnibal | 04d0209be9 | * Recognise multiple infixes in a token. | 2016-04-13 18:38:26 +10:00 |