| 
							
							
								 Raphaël Bournhonesque | 85f951ca99 | Add tokenizer exceptions for French | 2017-02-02 08:36:16 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | e4875834fe | Fix formatting | 2017-01-31 15:19:33 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c304834e45 | Add missing import | 2017-01-31 15:18:30 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | e6465b9ca3 | Parametrize test cases and mark as xfail | 2017-01-31 15:14:42 +01:00 |  | 
			
				
					| 
							
							
								 latkins | e4c84321a5 | Added regression test for Issue #792. | 2017-01-31 13:47:42 +00:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 19501f3340 | Add regression test for #775 | 2017-01-25 13:16:52 +01:00 |  | 
			
				
					| 
							
							
								 Raphaël Bournhonesque | 1be9c0e724 | Add fr tokenization unit tests | 2017-01-24 10:57:37 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0967eb07be | Add regression test for #768 | 2017-01-23 21:25:46 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5f6f48e734 | Add regression test for #759 | 2017-01-20 15:11:48 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d704cfa60d | Fix typo | 2017-01-16 21:30:33 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 2c60d0cb1e | Test #743: Tokens unhashable. | 2017-01-16 13:27:26 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 50878ef598 | Exclude "were" and "Were" from tokenizer exceptions and add regression test (resolves #744) | 2017-01-16 13:10:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | e053c7693b | Fix formatting | 2017-01-16 13:09:52 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 116c675c3c | Merge pull request #742 from oroszgy/hu_tokenizer_fix Improved Hungarian tokenizer | 2017-01-14 23:52:44 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 92345b6a41 | Further numeric test. | 2017-01-14 22:44:19 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | b4df202bfa | Better error handling | 2017-01-14 22:24:58 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | b03a46792c | Better error handling | 2017-01-14 22:09:29 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 332ce2d758 | Update README.md | 2017-01-14 21:12:11 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 9505c6a72b | Passing all old tests. | 2017-01-14 20:39:21 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 63037e79af | Fixed hyphen handling in the Hungarian tokenizer. | 2017-01-14 16:30:11 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | f77c0284d6 | Maintaining compatibility with other spacy tokenizers. | 2017-01-14 16:19:15 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 1be5da1ac6 | Fixed Hungarian tokenizer for numbers | 2017-01-14 15:51:59 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a89e269a5a | Fix test formatting and consistency | 2017-01-14 13:41:19 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3424e3a7e5 | Update README.md | 2017-01-13 15:54:54 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 49186b34a1 | Mark lemmatizer tests as models since they use installed data | 2017-01-13 15:12:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 138deb80a1 | Modernise vector tests, use add_vecs_to_vocab and don't depend on models | 2017-01-13 15:12:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 96f0caa28a | Fix test name for consistency | 2017-01-13 15:12:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | dc2bb1259f | Add util function to add vectors to vocab | 2017-01-13 15:12:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | db9b25663d | Reformat add_docs_equal and add docstring | 2017-01-13 15:12:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 62ce0a0073 | Add README.md to tests to explain organisation and conventions | 2017-01-13 15:11:18 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 38d60f6b90 | Modernise serializer I/O tests and don't depend on models where possible | 2017-01-13 02:24:56 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 4bb5b89ee4 | Add text_file_b fixture using BytesIO | 2017-01-13 02:23:50 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 49febd8c62 | Modernise noun chunks tests and don't depend on models | 2017-01-13 02:01:00 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3ee97b5686 | Rename test_parser to test_noun_chunks | 2017-01-13 01:36:33 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a308703f47 | Remove old tests | 2017-01-13 01:34:48 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 12eb8edf26 | Move parser tests from unit to parser | 2017-01-13 01:34:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 138c53ff2e | Merge tokenizer tests | 2017-01-13 01:34:14 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 01f36ca3ff | Move attrs tests from unit to root and modernise | 2017-01-13 01:33:50 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3610d27967 | Move alignment tests from munge to gold and modernise | 2017-01-13 01:33:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 094ff7396a | Reformat and rename Pragmatic Segmenter tests and mark xfails | 2017-01-13 01:30:20 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | affcf1b19d | Modernise lemmatizer tests | 2017-01-12 23:41:17 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 33d9cf87f9 | Modernise tagger tests and fix xpassing test | 2017-01-12 23:40:52 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 33e5f8dc2e | Create basic and extended test set for URLs | 2017-01-12 23:40:02 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5e4f5ebfc8 | Modernise BILUO tests | 2017-01-12 23:39:18 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 09acfbca01 | Add Lemmatizer fixture | 2017-01-12 23:38:55 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 514bfa2597 | Add path fixture for spaCy data path | 2017-01-12 23:38:47 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | e9e99a5670 | Add regression test for #740 | 2017-01-12 22:57:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 6935d55409 | Fix formatting | 2017-01-12 22:56:20 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5f0d196a31 | Modernise and merge matcher tests | 2017-01-12 22:23:11 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d5d774413a | Update comments on EN and DE fixtures | 2017-01-12 22:03:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 9b4bea1df9 | Tidy up and rename regression tests and remove unnecessary imports | 2017-01-12 22:00:37 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5e1b6178e3 | Fix formatting and consistency | 2017-01-12 22:00:06 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a3fd32455e | Remove redundant language loading integration tests | 2017-01-12 21:59:48 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 61f1ca09c2 | Modernise serializer codecs tests | 2017-01-12 21:58:55 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5dbc6e59f6 | Modernise Huffman tests | 2017-01-12 21:58:40 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | edeeeccea5 | Modernise packer tests and don't depend on models where possible | 2017-01-12 21:58:07 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d084676cd0 | Modernise and merge serialization tests | 2017-01-12 21:57:19 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 442237787c | Add assert_docs_equal util to compare two docs | 2017-01-12 21:56:52 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | eac3f700fb | Add fixture for entity recognizer | 2017-01-12 21:56:32 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | b438cfddbc | Modernise matcher tests and split into two files | 2017-01-12 17:51:46 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 27482ebed8 | Move matcher tests for #188 and #242 to regression tests Modernise tests and remove unnecessary imports | 2017-01-12 17:33:57 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0a4dc632bd | Update test to not create redundant Doc object | 2017-01-12 17:33:18 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a2526e66d8 | Fix formatting, naming and unicode declaration | 2017-01-12 16:51:13 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 052cdff07d | Modernise vector similarity tests | 2017-01-12 16:51:13 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | bd20ec0a6a | Add get_cosine util function | 2017-01-12 16:51:13 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 51ef75f629 | Fix regression test for #615 and remove unnecessary imports | 2017-01-12 16:51:12 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | aeb747e10c | Adjust formatting | 2017-01-12 16:51:12 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8e3e58a7e6 | Modernise and merge lexeme vocab tests | 2017-01-12 16:51:12 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c3d4516fc2 | Move test for #361 to regression tests | 2017-01-12 16:51:12 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 7cb3d74426 | Modernise span tests and don't depend on models | 2017-01-12 15:30:49 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 92e3d8b3ee | Modernise vocab API tests and remove old xfailing tests | 2017-01-12 15:27:46 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 7ea87684cd | Rename test_vocab.py to test_vocab_api.py | 2017-01-12 15:12:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0da2ee5c68 | Merge flag features tests into orth tests in tests root | 2017-01-12 15:12:00 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 03c136cfd3 | Remove StringStore tests from vocab tests | 2017-01-12 15:11:15 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d7bd57abdf | Modernise add vectors vocab test | 2017-01-12 15:09:49 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 89525ef345 | Use consistent test names | 2017-01-12 15:09:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | f8803808ce | Remove old unused tests and conftest files | 2017-01-12 15:09:05 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 4d0bfebcd9 | Move Pragmatic Segmenter test cases (currently unused) to parser tests | 2017-01-12 15:08:02 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 26d018d874 | Add tests for StringStore | 2017-01-12 15:07:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 9b6784bab5 | Add fixture for StringStore | 2017-01-12 15:05:40 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 99d66d613a | Modernise tests for merging spans and don't depend on models | 2017-01-12 12:26:26 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | fa8f67596d | Remove unused old test | 2017-01-12 12:26:08 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 359f73a96b | Move test for #54 to regression tests | 2017-01-12 12:25:51 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3f3a46722c | Remove unused conftest | 2017-01-12 12:25:24 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c2406e92bc | Allow setting ents in get_doc | 2017-01-12 12:25:10 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c5914c6fe5 | Fix and pass regression test for #736 | 2017-01-12 11:48:56 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a6790b6694 | Rename tags to pos in get_doc and allow adding tags to tokens | 2017-01-12 11:18:36 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 1add8ace67 | Merge lemmatizer tests | 2017-01-12 11:16:53 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3bc082abdf | Modernise morph exceptions test and don't depend on models | 2017-01-12 11:14:29 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | ec7739b76e | Add regression test for #736 | 2017-01-12 11:12:44 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 6c1c564891 | Move language-specific tests out of redundant tokenizer directories | 2017-01-12 02:17:18 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8fecedac3a | Tidy up | 2017-01-12 02:16:37 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | ae7edd30e7 | Move text file back to tokenizer tests directory | 2017-01-12 02:10:23 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | ffcaba9017 | Remove old and/or redundant tests | 2017-01-12 02:10:18 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 19c4132097 | Modernise space attachment parser tests and don't depend on models | 2017-01-12 01:54:44 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 69778924c8 | Modernise and merge parser tests and don't depend on models | 2017-01-12 01:07:29 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 178c147612 | Modernise nonprojectivity tests and don't depend on models | 2017-01-12 01:06:36 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 1a3984742c | Modernise sentence boundary detection tests and don't depend on models (where possible) | 2017-01-11 23:53:08 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0cdb6ea61d | Remove old unused pickle test | 2017-01-11 23:52:28 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c9671329dc | Move test for #309 to regression tests | 2017-01-11 23:52:13 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d0e37b5670 | Modernise parser tests and don't depend on models | 2017-01-11 21:30:27 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 342cb41782 | Add apply_transition_sequence util function to utils | 2017-01-11 21:30:14 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 09807addff | Add en_parser fixture | 2017-01-11 21:29:59 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 55d151aa61 | Modernise Doc parse tree navigation tests and don't depend on models | 2017-01-11 21:14:15 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 7262421bb2 | Use consistent test names | 2017-01-11 19:00:52 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 33800c9367 | Rename "tokens" tests to "doc" | 2017-01-11 18:59:01 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3a9c6a9563 | Remove old unused files | 2017-01-11 18:58:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8e962de39f | Remove old word vector tests | 2017-01-11 18:55:08 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | e027936920 | Modernise Doc noun chunks tests | 2017-01-11 18:54:56 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 439f396acd | Modernise Doc array tests and don't depend on models | 2017-01-11 18:54:46 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 05447be884 | Modernise test for adding entities | 2017-01-11 18:54:24 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 6e883f4c00 | Modernise Doc API tests and don't depend on models | 2017-01-11 18:05:36 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8bf3bb5c44 | Make words optional for get_doc | 2017-01-11 18:05:10 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 928db7e419 | Fix StringIO import for Python 3 | 2017-01-11 14:07:48 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 69998f216b | Rename test_tokens_api.py to test_doc_api.py | 2017-01-11 13:58:56 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d94dea1b18 | Merge token tests into token API tests | 2017-01-11 13:57:02 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | eb23424ab0 | Modernise token API tests and don't depend on loading models | 2017-01-11 13:56:54 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c682b8ca90 | Merge conftests into one cohesive file | 2017-01-11 13:56:32 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 909f24d7df | Add test utils and get_doc helper function Create Doc object from given vocab, words and annotations to allow tests not to depend on loading the models. | 2017-01-11 13:55:33 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3e6e1f0251 | Tidy up regression tests | 2017-01-10 19:24:10 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 869963c3c4 | Mark extensive prefix/suffix tests as slow | 2017-01-10 15:57:35 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 487e020ebe | Add simple test for surrounding brackets | 2017-01-10 15:57:26 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0ba5cf51d2 | Assert length first | 2017-01-10 15:57:00 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 2185d31907 | Adjust names and formatting | 2017-01-10 15:56:35 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | e10d4ca964 | Remove semi-redundant URLs and punctuation for faster testing | 2017-01-10 15:54:25 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 3a3cb2c90c | Add unicode declaration | 2017-01-10 15:53:15 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 64f747cb65 | Token comparison test | 2017-01-09 19:12:00 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 18c3c2d05c | Add tests for token comparison, re Issue #631 | 2017-01-09 19:09:59 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 42cd598f57 | Use correct fixtures in URL tokenizer | 2017-01-09 14:10:40 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | aa876884f0 | Revert "Revert "Merge remote-tracking branch 'origin/master'"" This reverts commit fb9d3bb022. | 2017-01-09 13:28:13 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d5c72c40eb | Remove old tests for old website example code | 2017-01-08 22:28:53 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5d28664fc5 | Don't test Hungarian for numbers and hyphens for now Reinvestigate behaviour of case affixes given reorganised tokenizer patterns. | 2017-01-08 20:45:40 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | abb09782f9 | Move sun.txt to original location and fix path to not break parser tests | 2017-01-08 20:32:54 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8328925e1f | Add newlines to long German text | 2017-01-05 18:13:30 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 55b46d7cf6 | Add tokenizer tests for German | 2017-01-05 18:11:25 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 5bb4081f52 | Remove redundant test_tokenizer.py for English | 2017-01-05 18:11:11 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8216ba599b | Add tests for longer and mixed English texts | 2017-01-05 18:11:04 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 65f937d5c6 | Move basic contraction tests to test_contractions.py | 2017-01-05 18:09:53 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | bbe7cab3a1 | Move non-English-specific tests back to general tokenizer tests | 2017-01-05 18:09:29 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 038002d616 | Reformat HU tokenizer tests and adapt to general style Improve readability of test cases and add conftest.py with fixture | 2017-01-05 18:06:44 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 637f785036 | Add general sanity tests for all tokenizers | 2017-01-05 16:25:38 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c5f2dc15de | Move English tokenizer tests to directory /en | 2017-01-05 16:25:04 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8b45363b4d | Modernize and merge general tokenizer tests | 2017-01-05 13:17:05 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 02cfda48c9 | Modernize and merge tokenizer tests for string loading | 2017-01-05 13:16:55 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | a11f684822 | Modernize and merge tokenizer tests for whitespace | 2017-01-05 13:16:33 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8b284fc6f1 | Modernize and merge tokenizer tests for text from file | 2017-01-05 13:15:52 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 2c2e878653 | Modernize and merge tokenizer tests for punctuation | 2017-01-05 13:14:16 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8a74129cdf | Modernize and merge tokenizer tests for prefixes/suffixes/infixes | 2017-01-05 13:13:12 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 0e65dca9a5 | Modernize and merge tokenizer tests for exception and emoticons | 2017-01-05 13:11:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 34c47bb20d | Fix formatting | 2017-01-05 13:10:51 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 2e72683baa | Add missing docstrings | 2017-01-05 13:10:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | da10a049a6 | Add unicode declarations | 2017-01-05 13:09:48 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 58adae8774 | Remove unused file | 2017-01-05 13:09:22 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | c6e5a5349d | Move regression test for #360 into own file | 2017-01-04 00:49:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8279993a6f | Modernize and merge tokenizer tests for punctuation | 2017-01-04 00:49:20 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 550630df73 | Update tokenizer tests for contractions | 2017-01-04 00:48:42 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 109f202e8f | Update conftest fixture | 2017-01-04 00:48:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | ee6b49b293 | Modernize tokenizer tests for emoticons | 2017-01-04 00:47:59 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | f09b5a5dfd | Modernize tokenizer tests for infixes | 2017-01-04 00:47:42 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 59059fed27 | Move regression test for #351 to own file | 2017-01-04 00:47:11 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 667051375d | Modernize tokenizer tests for whitespace | 2017-01-04 00:46:35 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | aafc894285 | Modernize tokenizer tests for contractions Use @pytest.mark.parametrize. | 2017-01-03 23:02:21 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | fb9d3bb022 | Revert "Merge remote-tracking branch 'origin/master'" This reverts commit d3b181cdf1, reversing changes made to b19cfcc144. | 2017-01-03 18:21:36 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 3ba7c167a8 | Fix URL tests | 2016-12-30 17:10:08 -06:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 9936a1b9b5 | Merge branch 'tokenization_w_exception_patterns' of https://github.com/oroszgy/spaCy.hu into oroszgy-tokenization_w_exception_patterns | 2016-12-30 14:53:40 -06:00 |  | 
			
				
					| 
							
							
								 kengz | 73a38bd4d1 | Merge remote-tracking branch 'upstream/master' | 2016-12-30 12:19:59 -05:00 |  | 
			
				
					| 
							
							
								 kengz | da44183ae1 | move parse_tree logic to a new tokens/printers.py file | 2016-12-30 12:19:18 -05:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 3e8d9c772e | Test interaction of token_match and punctuation Check that the new token_match function applies after punctuation is split off. | 2016-12-31 00:52:17 +11:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 45e045a87b | Unicode/UTF8 compatibility for Python2 | 2016-12-24 00:21:00 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 72b61b6d03 | Typo fix. | 2016-12-24 00:10:29 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 1748549aeb | Added exception pattern mechanism to the tokenizer. | 2016-12-21 23:16:19 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | ab2f6ea46c | Removed data files from tests.. | 2016-12-21 20:22:09 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 3d5306acb9 | Added further testcases. | 2016-12-20 23:49:35 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 23956e72ff | Improved partial support for tokenzing Hungarian numbers | 2016-12-20 23:36:59 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 6add156075 | Refactored language data structure | 2016-12-20 22:28:20 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 366b3f8685 | Merge branch 'master' into hu_tokenizer | 2016-12-20 20:53:31 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | c035928156 | Partial Hungarian number tokenization is added. | 2016-12-20 20:46:20 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | f38eb25fe1 | Fix test for word vector | 2016-12-18 23:31:55 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | e4c951c153 | Merge branch 'organize-language-data' of ssh://github.com/explosion/spaCy into organize-language-data | 2016-12-18 17:01:08 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | d1c1d3f9cd | Fix tokenizer test | 2016-12-18 16:55:32 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | bdcecb3c96 | Add import in regression test | 2016-12-18 16:51:31 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 77cf2fb0f6 | Remove unnecessary argument in test | 2016-12-18 14:06:27 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 121c310566 | Remove trailing whitespace | 2016-12-18 14:06:27 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 0595cc0635 | Change test595 to mock data, instead of requiring model. | 2016-12-18 13:28:51 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | f2c48ef504 | Resolve stopwords conflict to merge Dutch | 2016-12-17 13:08:16 +01:00 |  | 
			
				
					| 
							
							
								 Janneke van der Zwaan | 4a3fdcce8a | Merge github.com:explosion/spaCy into dutch | 2016-12-13 09:25:23 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 0cf2144d24 | Adding partial hyphen and quote handling support. | 2016-12-11 00:14:36 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 2051726fd3 | Passing Hungatian abbrev tests. | 2016-12-10 23:37:58 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 0289b8ceaa | Additional abbreviation tests. | 2016-12-08 12:17:44 +01:00 |  | 
			
				
					| 
							
							
								 Gyorgy Orosz | 5b00039955 | First steps towards the Hungarian tokenizer code. | 2016-12-07 23:07:43 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 8350d65695 | Change morphology and lemmatizer API Take morphology features as object instead of keyword arguments | 2016-12-07 21:12:49 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 52e7d634df | Remove trailing whitespace | 2016-12-07 21:12:19 +01:00 |  | 
			
				
					| 
							
							
								 Ines Montani | 07f0efb102 | Add test for tokenizer regular expressions | 2016-12-07 20:33:28 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | f6e356aada | Add (and test) Span.sentiment attribute. By default we average token.span, but can override with custom hook. Re Issue #667 | 2016-12-02 11:05:50 +01:00 |  | 
			
				
					| 
							
							
								 Janneke van der Zwaan | 88869e0e07 | Merge github.com:explosion/spaCy into dutch | 2016-11-30 17:13:39 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 6652f2a135 | Test #656, #624: special case rules for tokenizer with attributes. | 2016-11-25 12:44:13 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | 53d8ca8f51 | Add spacy.attrs.intify_attrs function, to normalize strings in token attribute dictionaries. | 2016-11-25 11:34:30 +01:00 |  | 
			
				
					| 
							
							
								 dafnevk | 3db8b0d322 | Added language class and some language data (with some TODOs) for Dutch | 2016-11-24 15:56:38 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | e01c1875ee | Work on test for #615 | 2016-11-23 23:48:41 +01:00 |  | 
			
				
					| 
							
							
								 Matthew Honnibal | e86f440ca6 | Fix test for issue 617 | 2016-11-10 22:48:10 +01:00 |  |