spaCy

mirror of https://github.com/explosion/spaCy.git synced 2024-11-11 12:18:04 +03:00

Author	SHA1	Message	Date
Matthew Honnibal	4bea65a1a8	Fix Issue #1450: Off-by-1 in * and ? matches. Patterns that end in variable-length operators, e.g. * and ?, now end on the correct token. Previously, they were off by 1: the next token was pulled into the match, even if that's where the pattern failed.	2017-10-24 14:26:27 +02:00
Matthew Honnibal	c29927d2e7	Fix matcher test	2017-10-16 17:22:18 +02:00
Matthew Honnibal	748d525801	Add more matcher operator tests	2017-10-16 13:38:01 +02:00
Matthew Honnibal	2534cd57d7	Add bandaid solution to the 'shadowing' problem in #864	2017-10-09 08:59:35 +02:00
Matthew Honnibal	3b67eabfea	Allow empty dictionaries to match any token in Matcher. Often patterns need to match "any token". A clean way to denote this is with the empty dict {}: this sets no constraints on the token, so it should always match. The problem was that an attributes length of 0 was used as an end-of-array signal, so the matcher didn't handle this case correctly. This patch compiles empty token spec dicts into a constraint NULL_ATTR==0. The NULL_ATTR attribute, 0, is always set to 0 on the lexeme, so this always matches.	2017-10-07 03:36:15 +02:00
Matthew Honnibal	cc408fc189	Make PhraseMatcher API like Matcher API	2017-09-20 22:20:35 +02:00
Matthew Honnibal	43ad250dd5	Update matcher tests	2017-09-20 21:54:49 +02:00
ines	c5714d4fb2	xfail matcher test for now until setting norm via Span.merge works	2017-05-29 10:51:02 +02:00
ines	00b2094dc3	Fix typos, long integers and tests	2017-05-29 01:09:52 +02:00
Matthew Honnibal	3959d778ac	Revert "Revert "WIP on improving parser efficiency"" This reverts commit `532afef4a8`.	2017-05-23 03:06:53 -05:00
Matthew Honnibal	532afef4a8	Revert "WIP on improving parser efficiency" This reverts commit `bdaac7ab44`.	2017-05-23 03:05:25 -05:00
Matthew Honnibal	bdaac7ab44	WIP on improving parser efficiency	2017-05-23 02:59:31 -05:00
ines	b3c7ee0148	Fix tests and use the new Matcher API	2017-05-22 13:54:20 +02:00
Ines Montani	5f0d196a31	Modernise and merge matcher tests	2017-01-12 22:23:11 +01:00
Ines Montani	f8803808ce	Remove old unused tests and conftest files	2017-01-12 15:09:05 +01:00
Dmitry Sadovnychyi	86c056ba64	Add basic test for PhraseMatcher #613	2016-11-09 00:10:32 +08:00
Matthew Honnibal	7d446e5094	Revert "Update matcher test, to reflect character offset return instead of token offset." This reverts commit `f8d3e3bcfe`.	2016-10-17 16:49:49 +02:00
Matthew Honnibal	f8d3e3bcfe	Update matcher test, to reflect character offset return instead of token offset.	2016-10-17 16:00:10 +02:00
Matthew Honnibal	8951bf6989	Update matcher tests	2016-10-17 01:53:24 +02:00
Matthew Honnibal	bd7fe6420c	Revert "Changes to test for new string-store" This reverts commit `21e90d7d0b`.	2016-09-30 20:11:01 +02:00
Matthew Honnibal	21e90d7d0b	Changes to test for new string-store	2016-09-30 20:00:58 +02:00
Matthew Honnibal	95aaea0d3f	Refactor so that the tokenizer data is read from Python data, rather than from disk	2016-09-25 14:49:53 +02:00
Matthew Honnibal	83e364188c	Mostly finished loading refactoring. Design is in place, but doesn't work yet.	2016-09-24 15:42:01 +02:00
Matthew Honnibal	b00f683a0c	Fix matcher test	2016-09-24 11:20:58 +02:00
Matthew Honnibal	939a791a52	Update tests	2016-09-24 01:17:03 +02:00
Matthew Honnibal	58e83fe34b	Initial, limited support for quantified patterns in Matcher, and tracking of ent_id attribute in Token and Span. The quantifiers need a lot more testing, and there are some known problems. The main known problem is that the zero-plus and one-plus quantifiers won't work if a token can match both the quantified pattern expression AND the tail of the match.	2016-09-21 14:54:55 +02:00
Matthew Honnibal	4e16f9e435	* Move tests underneath spacy/	2015-10-26 00:07:31 +11:00

27 Commits
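
The behaviour described in commits 3b67eabfea (an empty dict {} matches any token) and 4bea65a1a8 (patterns ending in * or ? stop on the correct token) can be illustrated with a small plain-Python sketch. This is not spaCy's implementation, which is written in Cython and works on lexeme attributes. The helper names `token_matches` and `longest_match_end` are hypothetical, tokens are modelled as plain dicts, and only the "OP" key mirrors the Matcher pattern syntax.

```python
def token_matches(token, spec):
    """Check one token (a dict of attributes) against one token spec.

    An empty spec sets no constraints: all() over no items is True,
    so {} matches any token.
    """
    return all(token.get(attr) == value for attr, value in spec.items())


def longest_match_end(tokens, pattern, i=0, p=0, consumed=False):
    """Return the exclusive end index of the longest match of pattern[p:]
    starting at tokens[i], or None if there is no match.

    Supported "OP" values (assumed for this sketch): "?" (zero or one),
    "*" (zero or more), "+" (one or more). No "OP" means exactly one.
    """
    if p == len(pattern):
        return i  # whole pattern satisfied; the match ends before token i
    spec = {k: v for k, v in pattern[p].items() if k != "OP"}
    op = pattern[p].get("OP")
    ends = []
    if i < len(tokens) and token_matches(tokens[i], spec):
        if op in ("*", "+"):
            # Consume the token and stay on the same pattern element.
            ends.append(longest_match_end(tokens, pattern, i + 1, p, True))
        else:
            # Consume the token and move to the next pattern element.
            ends.append(longest_match_end(tokens, pattern, i + 1, p + 1))
    if op in ("?", "*") or (op == "+" and consumed):
        # Leave this element without consuming the current token.
        ends.append(longest_match_end(tokens, pattern, i, p + 1))
    ends = [e for e in ends if e is not None]
    return max(ends) if ends else None


tokens = [{"ORTH": word} for word in "a b b c".split()]

# Pattern ending in "*": the match covers "a b b" and does not pull in "c".
star_end = longest_match_end(tokens, [{"ORTH": "a"}, {"ORTH": "b", "OP": "*"}])

# Pattern ending in "?": the match covers "a b".
opt_end = longest_match_end(tokens, [{"ORTH": "a"}, {"ORTH": "b", "OP": "?"}])

# Empty dict as a wildcard between two constrained tokens.
wild_end = longest_match_end(tokens, [{"ORTH": "a"}, {}, {"ORTH": "b"}])

print(star_end, opt_end, wild_end)  # 3 2 3
```

The end index is only advanced when a token is actually consumed, so a trailing * or ? cannot extend the match onto the token where the pattern stopped matching.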