spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-14 13:47:13 +03:00

Author	SHA1	Message	Date
Eric Zhao	d61c117081	Lowest common ancestor matrix for spans and docs Added functionality for spans and docs to get lowest common ancestor matrix by simply calling: doc.get_lca_matrix() or doc[:3].get_lca_matrix(). Corresponding unit tests were also added under spacy/tests/doc and spacy/tests/spans. Designed to address: https://github.com/explosion/spaCy/issues/969.	2017-09-03 12:22:19 -07:00
ines	dcff10abe9	Add regression test for #1281	2017-08-21 16:11:47 +02:00
Matthew Honnibal	796b2f4c1b	Remove print statements in tests	2017-07-22 15:42:38 +02:00
Matthew Honnibal	4b2e5e59ed	Add flush_cache method to tokenizer, to fix #1061 The tokenizer caches output for common chunks, for efficiency. This cache must be invalidated when the tokenizer rules change, e.g. when a new special-case rule is introduced. That's what was causing #1061. When the cache is flushed, we free the intermediate token chunks. I think this is safe --- but if we start getting segfaults, this patch is to blame. The resolution would be to simply not free those bits of memory. They'll be freed when the tokenizer exits anyway.	2017-07-22 15:06:50 +02:00
Matthew Honnibal	d9b85675d7	Rename regression test	2017-07-22 14:14:35 +02:00
Matthew Honnibal	dfbc7e49de	Add test for Issue #1207	2017-07-22 14:14:01 +02:00
Matthew Honnibal	0ae3807d7d	Fix gaps in Lexeme API. Closes #1031	2017-07-22 13:53:48 +02:00
Paul O'Leary McCann	bc87b815cc	Add comment clarifying what LANGUAGES does	2017-07-09 16:28:55 +09:00
Paul O'Leary McCann	04e6a65188	Remove Japanese from LANGUAGES LANGUAGES is a list of languages whose tokenizers get run through a variety of generic tests. Since the generic tests don't check the JA fixture, it blows up when it can't find janome. -POLM	2017-07-09 16:23:26 +09:00
Paul O'Leary McCann	c336193392	Parametrize and extend Japanese tokenizer tests	2017-06-29 00:09:40 +09:00
Paul O'Leary McCann	30a34ebb6e	Add importorskip for janome	2017-06-29 00:09:20 +09:00
Paul O'Leary McCann	e56fea14eb	Add basic Japanese tokenizer test	2017-06-28 01:24:25 +09:00
ines	6e1dbc608e	Fix parse_tree test	2017-05-13 12:34:20 +02:00
Matthew Honnibal	ad590feaa8	Fix test, which imported English incorrectly	2017-05-13 11:36:19 +02:00
Matthew Honnibal	b2540d2379	Merge Kengz's tree_print patch	2017-05-13 03:18:49 +02:00
Ines Montani	7da9cefd25	Merge pull request #1022 from luvogels/master Initial support for Norwegian Bokmål	2017-04-27 11:16:06 +02:00
luvogels	d12a0b6431	Hooked up tokenizer tests	2017-04-26 23:21:41 +02:00
luvogels	8de59ce3b9	Added tokenizer tests	2017-04-26 19:10:18 +02:00
Matthew Honnibal	4d98511db7	Make Span hashable. Closes #1019	2017-04-26 19:01:05 +02:00
Matthew Honnibal	24c4c51f13	Try to make test999 less flakey	2017-04-26 18:42:06 +02:00
Matthew Honnibal	c4be9c36fe	Fix unicode header in tests	2017-04-24 10:09:01 +02:00
Matthew Honnibal	65f10b53e5	Fix test	2017-04-24 00:25:55 +02:00
Matthew Honnibal	70a43858e1	Fix flakey test	2017-04-24 00:06:30 +02:00
Matthew Honnibal	3973af2d15	Make training test less flakey	2017-04-23 22:59:34 +02:00
ines	42305bc519	Remove unnecessary test	2017-04-23 21:21:41 +02:00
ines	012ea594d1	Add file for misc tests	2017-04-23 21:06:51 +02:00
ines	83f66947dc	Rename test_download to test_cli	2017-04-23 21:06:50 +02:00
Matthew Honnibal	874a3cbb07	Add test for Issue #955	2017-04-23 17:57:01 +02:00
Matthew Honnibal	5d8af40445	Add test for Issue #999	2017-04-23 17:06:30 +02:00
Matthew Honnibal	040751ad17	Remove xfail on Test #910	2017-04-23 16:28:55 +02:00
Ben Eyal	e90e8a3f10	Enable test	2017-04-20 02:25:24 +03:00
ines	2bd89e7ade	Tidy up Hebrew tests and test for punctuation (see #995)	2017-04-19 19:28:03 +02:00
ines	13d30b6c01	xfail lemmatizer test that's causing problems (see #546)	2017-04-16 21:18:39 +02:00
ines	0084466a66	Remove unused utf8open util and replace os.path with ensure_path	2017-04-16 20:37:45 +02:00
Matthew Honnibal	1dca7eeb03	Add unicode declaration on new regression test	2017-04-07 18:09:23 +02:00
ines	887827fc6a	Merge branch 'develop'	2017-04-07 17:36:23 +02:00
ines	444dd511c5	Fix xpassing URL test case	2017-04-07 17:36:05 +02:00
ines	bf0f15e762	Add / to tokenizer infixes (resolves #891)	2017-04-07 17:30:44 +02:00
ines	00b9011a49	Fix whitespace	2017-04-07 17:29:59 +02:00
Matthew Honnibal	0513c43bf0	Merge branch 'master' of https://github.com/explosion/spaCy	2017-04-07 17:07:10 +02:00
Matthew Honnibal	cc36c308f4	Fix noun_chunk rules around coordination Closes #693.	2017-04-07 17:06:40 +02:00
Matthew Honnibal	ab846256cf	Merge pull request #966 from recognai/master Prepare Spanish language for training models, including configuration, rich-UD tag map and tests	2017-04-07 16:12:29 +02:00
Matthew Honnibal	83dca920d4	Rename test #913 -> #957, comment Make test for #957 reference correct bug. Add comment. Previous commit closes #957.	2017-04-07 15:54:25 +02:00
Matthew Honnibal	5887383fc0	Add test for Issue #913: Hang from bad regex	2017-04-07 15:47:27 +02:00
oeg	c693d40791	feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basic tests	2017-04-06 18:48:45 +02:00
Matthew Honnibal	cfff4e0f61	Improve test	2017-03-31 13:59:32 +02:00
Matthew Honnibal	e854f28304	Add test for Issue #758 Issue #758 occurs when no actions are available for a single token doc after merging.	2017-03-31 13:26:25 +02:00
Matthew Honnibal	0fefdfcbda	Merge pull request #935 from ericzhao28/master Add option to use label=ent_type in doc.merge arguments (Bug fix for issue #862)	2017-03-30 02:51:24 +02:00
Eric Zhao	aafdf6ffb8	Add option to use label kwarg to determine ent_type in doc.merge	2017-03-28 23:35:03 -07:00
Matthew Honnibal	b94286de30	Fix regression test	2017-03-25 22:35:07 +01:00
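
The message for commit 4b2e5e59ed describes a cache of tokenized chunks that has to be invalidated whenever the tokenizer rules change. The following is a minimal pure-Python sketch of that idea only, not spaCy's actual implementation (which is written in Cython and also frees memory on flush). The class `CachingTokenizer`, its whitespace-based splitting, and the `_split` helper are hypothetical names introduced for illustration; only the ideas of a `flush_cache` method and special-case rules come from the commit message.

```python
class CachingTokenizer:
    """Toy whitespace tokenizer with a per-chunk cache (illustrative only)."""

    def __init__(self):
        self.special_cases = {}  # chunk -> list of tokens
        self._cache = {}         # chunk -> cached tokenization

    def add_special_case(self, chunk, tokens):
        # Changing the rules makes cached results stale, so flush them.
        self.special_cases[chunk] = list(tokens)
        self.flush_cache()

    def flush_cache(self):
        # Drop every cached tokenization.
        self._cache.clear()

    def _split(self, chunk):
        # Apply a special-case rule if one exists, else keep the chunk whole.
        if chunk in self.special_cases:
            return list(self.special_cases[chunk])
        return [chunk]

    def __call__(self, text):
        tokens = []
        for chunk in text.split():
            if chunk not in self._cache:
                self._cache[chunk] = self._split(chunk)
            tokens.extend(self._cache[chunk])
        return tokens


tok = CachingTokenizer()
before = tok("don't go")                   # "don't" is now cached as one token
tok.add_special_case("don't", ["do", "n't"])
after = tok("don't go")                    # cache was flushed, new rule applies
```

Without the `flush_cache()` call inside `add_special_case`, the second call would return the stale cached result for "don't", which is the kind of bug the commit message attributes to #1061.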


557 Commits
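
Commit d61c117081 adds a lowest common ancestor matrix for docs and spans via `doc.get_lca_matrix()`. As a rough illustration of what such a matrix contains, here is a pure-Python sketch that does not use spaCy. It assumes the parse is given as a list `heads`, where `heads[i]` is the index of token `i`'s head and a root points to itself, and it assumes `-1` marks token pairs with no common ancestor. The function name `lca_matrix` and this input format are hypothetical, not the library's API.

```python
def lca_matrix(heads):
    """Return an n x n matrix whose [i][j] entry is the index of the lowest
    common ancestor of tokens i and j, or -1 if they share none.

    Assumes heads describes a valid forest (no cycles); a root has
    heads[i] == i.
    """
    n = len(heads)

    def path_to_root(i):
        # Token i followed by its ancestors, ending at the root.
        path = [i]
        while heads[path[-1]] != path[-1]:
            path.append(heads[path[-1]])
        return path

    paths = [path_to_root(i) for i in range(n)]
    matrix = [[-1] * n for _ in range(n)]
    for i in range(n):
        on_path = set(paths[i])
        for j in range(n):
            # The first node on j's upward path that also lies on i's
            # path is the lowest common ancestor.
            for node in paths[j]:
                if node in on_path:
                    matrix[i][j] = node
                    break
    return matrix


# "She likes green apples": She -> likes, likes is the root,
# green -> apples, apples -> likes.
heads = [1, 1, 3, 1]
result = lca_matrix(heads)
```

A token counts as its own ancestor here, so the diagonal holds each token's own index.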