spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-12-26 18:06:29 +03:00

Author	SHA1	Message	Date
Jordan Suchow	38ed265b7d	Tweak line spacing	2015-04-19 13:01:38 -07:00
Jordan Suchow	3a8d9b37a6	Remove trailing whitespace	2015-04-19 13:01:38 -07:00
Matthew Honnibal	47a4371fea	* Upd tokenizer with i.e. tests	2015-02-18 06:37:04 -05:00
leofidus	363473aeed	Add tokenizer test for zero length string	2015-02-10 08:20:32 -05:00
Matthew Honnibal	d0e08a5b57	* Upd index tests	2015-01-30 12:35:13 +11:00
Matthew Honnibal	706305ee26	* Upd tests for new meaning of 'string'	2015-01-24 07:22:30 +11:00
Matthew Honnibal	5ed8b2b98f	* Rename sic to orth	2015-01-23 02:08:25 +11:00
Matthew Honnibal	93d4bd6c2e	* Add test for ). in tokenizer	2015-01-22 22:25:18 +11:00
Matthew Honnibal	7d3c40de7d	* Tests passing after refactor. API has obvious warts, particularly in Token and Lexeme	2015-01-15 00:33:16 +11:00
Matthew Honnibal	81d878beb2	* Upd tests	2014-12-30 21:34:09 +11:00
Matthew Honnibal	91a5064b7f	* Upd tests	2014-12-26 14:26:27 +11:00
Matthew Honnibal	73f200436f	* Tests passing except for morphology/lemmatization stuff	2014-12-23 11:40:32 +11:00
Matthew Honnibal	0d9972f4b0	* Upd tokenizer test	2014-12-21 20:38:27 +11:00
Matthew Honnibal	302e09018b	* Work on fixing special-cases, reading them in as JSON objects so that they can specify lemmas	2014-12-09 14:48:01 +11:00
Matthew Honnibal	0de700b566	* Comment out tests of hyphenation, while we decide what hyphenation policy should be.	2014-11-05 02:03:22 +11:00
Matthew Honnibal	63114820cf	* Upd tests for tighter interface	2014-10-30 18:15:30 +11:00
Matthew Honnibal	13909a2e24	* Rewriting Lexeme serialization.	2014-10-29 23:19:38 +11:00
Matthew Honnibal	08ce602243	* Large refactor, particularly to Python API	2014-10-24 00:59:17 +11:00
Matthew Honnibal	6fb42c4919	* Add offsets to Tokens class. Some changes to interfaces, and reorganization of spacy.Lang	2014-10-14 16:17:45 +11:00
Matthew Honnibal	db191361ee	* Add new tests for fancier tokenization cases	2014-09-15 06:31:58 +02:00
Matthew Honnibal	5dcc1a426a	* Update tokenization tests for new tokenizer rules	2014-09-15 01:32:51 +02:00
Matthew Honnibal	985bc68327	* Fix bug with trailing punct on contractions. Reduced efficiency, and slightly hacky implementation.	2014-09-12 18:26:26 +02:00
Matthew Honnibal	b5b31c6b6e	* Avoid testing for object identity	2014-09-10 20:58:30 +02:00
Matthew Honnibal	c282e6d5fb	* Redesign proceeding	2014-08-28 19:45:09 +02:00
Matthew Honnibal	9815c7649e	* Refactor around Word objects, adapting tests. Tests passing, except for string views.	2014-08-23 19:55:06 +02:00
Matthew Honnibal	01469b0888	* Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word.	2014-08-18 19:14:00 +02:00
Matthew Honnibal	e4263a241a	* Tests passing for reorganized version	2014-07-07 04:23:46 +02:00

27 Commits