spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-10-04 02:46:40 +03:00

Author	SHA1	Message	Date
Ines Montani	8e962de39f	Remove old word vector tests	2017-01-11 18:55:08 +01:00
Ines Montani	e027936920	Modernise Doc noun chunks tests	2017-01-11 18:54:56 +01:00
Ines Montani	439f396acd	Modernise Doc array tests and don't depend on models	2017-01-11 18:54:46 +01:00
Ines Montani	05447be884	Modernise test for adding entities	2017-01-11 18:54:24 +01:00
Ines Montani	6e883f4c00	Modernise Doc API tests and don't depend on models	2017-01-11 18:05:36 +01:00
Ines Montani	8bf3bb5c44	Make words optional for get_doc	2017-01-11 18:05:10 +01:00
Ines Montani	928db7e419	Fix StringIO import for Python 3	2017-01-11 14:07:48 +01:00
Ines Montani	69998f216b	Rename test_tokens_api.py to test_doc_api.py	2017-01-11 13:58:56 +01:00
Ines Montani	d94dea1b18	Merge token tests into token API tests	2017-01-11 13:57:02 +01:00
Ines Montani	eb23424ab0	Modernise token API tests and don't depend on loading models	2017-01-11 13:56:54 +01:00
Ines Montani	c682b8ca90	Merge conftests into one cohesive file	2017-01-11 13:56:32 +01:00
Ines Montani	909f24d7df	Add test utils and get_doc helper function. Create Doc object from given vocab, words and annotations to allow tests not to depend on loading the models.	2017-01-11 13:55:33 +01:00
Matthew Honnibal	e12c90e03f	Merge branch 'master' of ssh://github.com/explosion/spaCy	2017-01-11 13:03:51 +01:00
Matthew Honnibal	12cd27b821	Amend 8ae8b443f: Handle comparison with None tokens.	2017-01-11 13:03:32 +01:00
Daniel Hershcovich	8e603cc917	Avoid "True if ... else False"	2017-01-11 11:18:22 +02:00
Matthew Honnibal	44e2b0100d	Support TAG attribute in doc.from_array	2017-01-10 22:47:07 +01:00
Ines Montani	3e6e1f0251	Tidy up regression tests	2017-01-10 19:24:10 +01:00
Ines Montani	869963c3c4	Mark extensive prefix/suffix tests as slow	2017-01-10 15:57:35 +01:00
Ines Montani	487e020ebe	Add simple test for surrounding brackets	2017-01-10 15:57:26 +01:00
Ines Montani	0ba5cf51d2	Assert length first	2017-01-10 15:57:00 +01:00
Ines Montani	2185d31907	Adjust names and formatting	2017-01-10 15:56:35 +01:00
Ines Montani	e10d4ca964	Remove semi-redundant URLs and punctuation for faster testing	2017-01-10 15:54:25 +01:00
Ines Montani	3a3cb2c90c	Add unicode declaration	2017-01-10 15:53:15 +01:00
Matthew Honnibal	0f9b8a00a5	Unbreak data download	2017-01-09 23:40:26 +01:00
Matthew Honnibal	8ae8b443f1	Add richcmp method to Token. Closes #631	2017-01-09 19:30:31 +01:00
Matthew Honnibal	64f747cb65	Token comparison test	2017-01-09 19:12:00 +01:00
Matthew Honnibal	18c3c2d05c	Add tests for token comparison, re Issue #631	2017-01-09 19:09:59 +01:00
Matthew Honnibal	97a1286129	Revert changes to tagger and parser for thinc 6	2017-01-09 10:08:34 -06:00
Matthew Honnibal	95a52005df	Revert "Fix Issue #683 : Add 'SP' to tag_map, if it's not there already, within the Morphology class." This reverts commit `40e71586d6`.	2017-01-09 09:55:55 -06:00
Ines Montani	363f09e68c	Merge pull request #726 from magnusburton/master: Added Swedish abbreviations as token exceptions	2017-01-09 14:58:15 +01:00
Matthew Honnibal	42cd598f57	Use correct fixtures in URL tokenizer	2017-01-09 14:10:40 +01:00
Matthew Honnibal	d9a77ddf14	Return None for data path if it doesn't exist	2017-01-09 14:10:05 +01:00
Matthew Honnibal	e4862d1dab	Merge branch 'develop'	2017-01-09 13:36:01 +01:00
Ines Montani	aa876884f0	Revert "Revert "Merge remote-tracking branch 'origin/master'"" This reverts commit `fb9d3bb022`.	2017-01-09 13:28:13 +01:00
Ines Montani	d5c72c40eb	Remove old tests for old website example code	2017-01-08 22:28:53 +01:00
Ines Montani	eef94e3ee2	Split off period after two or more uppercase letters (fixes #483)	2017-01-08 22:28:25 +01:00
Ines Montani	a89a6000e5	Remove unused import	2017-01-08 22:17:37 +01:00
Ines Montani	5d28664fc5	Don't test Hungarian for numbers and hyphens for now. Reinvestigate behaviour of case affixes given reorganised tokenizer patterns.	2017-01-08 20:45:40 +01:00
Ines Montani	53362b6b93	Reorganise Hungarian prefixes/suffixes/infixes. Use global prefixes and suffixes for non-language-specific rules, import list of alpha unicode characters and adjust regexes.	2017-01-08 20:40:33 +01:00
Ines Montani	347c4a2d06	Reorganise and reformat global tokenizer prefixes, suffixes and infixes	2017-01-08 20:37:39 +01:00
Ines Montani	0dec90e9f7	Use global abbreviation data languages and remove duplicates	2017-01-08 20:36:00 +01:00
Ines Montani	7c3cb2a652	Add global abbreviations data	2017-01-08 20:34:03 +01:00
Ines Montani	de5aa92bc2	Handle deprecated tokenizer prefix data	2017-01-08 20:33:28 +01:00
Ines Montani	abb09782f9	Move sun.txt to original location and fix path to not break parser tests	2017-01-08 20:32:54 +01:00
Ines Montani	cab39c59c5	Add missing contractions to English tokenizer exceptions. Inspired by https://github.com/kootenpv/contractions/blob/master/contractions/__init__.py	2017-01-05 19:59:06 +01:00
Ines Montani	a23504fe07	Move abbreviations below other exceptions	2017-01-05 19:58:07 +01:00
Ines Montani	7d2cf934b9	Generate he/she/it correctly with 's instead of 've	2017-01-05 19:57:00 +01:00
Ines Montani	8328925e1f	Add newlines to long German text	2017-01-05 18:13:30 +01:00
Ines Montani	55b46d7cf6	Add tokenizer tests for German	2017-01-05 18:11:25 +01:00
Ines Montani	5bb4081f52	Remove redundant test_tokenizer.py for English	2017-01-05 18:11:11 +01:00


2192 Commits