spaCy/spacy
Paul O'Leary McCann 2a7e327310
Fix Dependency Matcher Ordering Issue (#9337)
* Fix inconsistency

This makes the failing test pass, so that behavior is consistent whether
patterns are added in one call or two.

The issue is that the hash for each pattern depended on its index within
the patterns passed to the current call, not its index among all patterns
added so far, so patterns from a second call would get match ids identical
to those from the first.
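
The collision described above can be reproduced with a toy registry. This is a
minimal sketch, not the actual DependencyMatcher code: `ToyRegistry`,
`make_keys`, and the `use_total_index` flag are hypothetical names, and a plain
string stands in for the real hash.

```python
def make_keys(rule_name, patterns, offset=0):
    """Build one internal key per pattern from the rule name and a position.

    A plain string stands in for the hash (assumption for illustration).
    """
    return [f"{rule_name}:{offset + i}" for i, _ in enumerate(patterns)]


class ToyRegistry:
    """Hypothetical stand-in for the matcher's pattern bookkeeping."""

    def __init__(self, use_total_index):
        self.use_total_index = use_total_index
        self.keys = {}  # rule name -> list of internal keys

    def add(self, rule_name, patterns):
        existing = self.keys.setdefault(rule_name, [])
        # Buggy variant: positions restart at 0 on every call.
        # Fixed variant: positions continue from the patterns already stored.
        offset = len(existing) if self.use_total_index else 0
        existing.extend(make_keys(rule_name, patterns, offset))
        return existing


buggy = ToyRegistry(use_total_index=False)
buggy.add("rule", ["pattern_a"])
buggy_keys = buggy.add("rule", ["pattern_b"])   # second call reuses position 0

fixed = ToyRegistry(use_total_index=True)
fixed.add("rule", ["pattern_a"])
fixed_keys = fixed.add("rule", ["pattern_b"])   # second call continues at 1

single = ToyRegistry(use_total_index=False)
single_keys = single.add("rule", ["pattern_a", "pattern_b"])
```

With the offset applied, two separate calls yield the same keys as one combined
call, which is the consistency the failing test checks for.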

* Add illustrative test case

* Add failing test for remove case

Patterns are not removed from the internal matcher on calls to remove,
which causes spurious matches (or missed matches).

* Fix removal issue

Remove patterns from the internal matcher.
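
The removal problem can be sketched the same way. This is an illustration under
stated assumptions, not the real implementation: `ToyDependencyMatcher`, its
`internal` dict (standing in for the internal matcher), and the
`clean_internal` flag are all hypothetical.

```python
class ToyDependencyMatcher:
    """Hypothetical model: a public pattern table plus an internal matcher."""

    def __init__(self):
        self.patterns = {}  # rule name -> list of patterns
        self.internal = {}  # (rule name, index) -> pattern; the "internal matcher"

    def add(self, name, patterns):
        stored = self.patterns.setdefault(name, [])
        for pattern in patterns:
            # Index continues from the patterns already stored for this rule.
            self.internal[(name, len(stored))] = pattern
            stored.append(pattern)

    def remove(self, name, clean_internal=True):
        if name not in self.patterns:
            raise KeyError(name)
        del self.patterns[name]
        if clean_internal:
            # The fix: also drop every internal entry that belongs to the rule.
            for key in [k for k in self.internal if k[0] == name]:
                del self.internal[key]

    def match(self, token):
        # Matching consults only the internal matcher, as in the bug report.
        return sorted(k for k, p in self.internal.items() if p == token)


stale = ToyDependencyMatcher()
stale.add("rule", ["dog"])
stale.remove("rule", clean_internal=False)   # models the old behavior
stale_matches = stale.match("dog")           # removed rule still matches

clean = ToyDependencyMatcher()
clean.add("rule", ["dog"])
clean.remove("rule")                         # models the fixed behavior
clean_matches = clean.match("dog")
```

In the stale variant the rule is gone from the public table but still fires,
which is the kind of spurious match the failing test exposes.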

* Check that the single add call also gets no matches
2021-10-11 10:26:13 +02:00
cli avoid crash when unicode in title (#9254) 2021-09-22 21:01:34 +02:00
displacy Adjust kb_id visualizer templating and docs 2021-09-23 11:59:02 +02:00
lang Fix verbs list in lang/fr/tokenizer_exceptions.py (#9033) 2021-08-25 15:55:09 +02:00
matcher Fix Dependency Matcher Ordering Issue (#9337) 2021-10-11 10:26:13 +02:00
ml Correct parser.py use_upper param info (#9180) 2021-09-10 16:19:58 +02:00
pipeline Auto-format code with black (#9065) 2021-08-27 11:42:27 +02:00
tests Fix Dependency Matcher Ordering Issue (#9337) 2021-10-11 10:26:13 +02:00
tokens Auto-format code with black (#9346) 2021-10-01 11:17:11 +02:00
training Sync vocab in vectors and components sourced in configs (#9335) 2021-10-04 12:19:02 +02:00
__init__.pxd * Seems to be working after refactor. Need to wire up more POS tag features, and wire up save/load of POS tags. 2014-10-24 02:23:42 +11:00
__init__.py Tidy up and auto-format 2021-07-18 15:44:56 +10:00
__main__.py Tidy up 2020-06-22 00:45:40 +02:00
about.py Prepare for v3.1.3 (#9200) 2021-09-14 11:03:51 +02:00
attrs.pxd Merge branch 'develop' into master-tmp 2020-05-21 18:39:06 +02:00
attrs.pyx Remove unsupported attrs from attrs.IDS (#8132) 2021-06-02 19:16:57 +10:00
compat.py Auto-detect package dependencies in spacy package (#8948) 2021-08-17 14:05:13 +02:00
default_config_pretraining.cfg pretrain architectures (#6451) 2020-12-08 14:41:03 +08:00
default_config.cfg Add training option to set annotations on update (#7767) 2021-04-26 16:53:53 +02:00
errors.py Raise E983 early on in docbin init (#9247) 2021-09-27 20:43:03 +02:00
glossary.py Add glossary entry for _SP (#8983) 2021-08-20 12:04:02 +02:00
kb.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
kb.pyx KB & NEL to/from bytes (#8113) 2021-05-20 18:11:30 +10:00
language.py Sync vocab in vectors and components sourced in configs (#9335) 2021-10-04 12:19:02 +02:00
lexeme.pxd Fix Lexeme.from_ptr 2020-08-10 16:43:37 +02:00
lexeme.pyi Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
lexeme.pyx fix 's typo's across code base (#8384) 2021-06-15 10:57:08 +02:00
lookups.py Tidy up code 2021-06-28 12:08:15 +02:00
morphology.pxd Clean up Morphology imports and definitions (#7441) 2021-04-26 16:54:23 +02:00
morphology.pyx Clean up Morphology imports and definitions (#7441) 2021-04-26 16:54:23 +02:00
parts_of_speech.pxd Add support for Universal Dependencies v2.0 2017-03-03 13:17:34 +01:00
parts_of_speech.pyx Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
pipe_analysis.py Tidy up and auto-format 2020-09-29 21:39:28 +02:00
py.typed Add py.typed 2021-03-16 09:48:31 +01:00
schemas.py Support list values and INTERSECTS in Matcher (#8784) 2021-08-02 19:39:26 +02:00
scorer.py Tidy up code 2021-06-28 12:08:15 +02:00
strings.pxd Remove 'cleanup' of strings (#6007) 2020-09-01 16:12:15 +02:00
strings.pyi Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
strings.pyx Make vocab update in get_docs deterministic (#7603) 2021-04-09 11:53:13 +02:00
structs.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
symbols.pxd introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
symbols.pyx introduce token.has_head and refer to MISSING_DEP_ (WIP) 2021-01-12 17:17:06 +01:00
tokenizer.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
tokenizer.pyx Pass excludes when serializing vocab (#8824) 2021-08-03 14:42:44 +02:00
typedefs.pxd Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master 2020-11-25 11:49:34 +01:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Handle spacy-legacy in package CLI for dependencies (#9163) 2021-09-08 11:46:40 +02:00
vectors.pyx Fix vectors data on GPU (#7626) 2021-04-19 18:30:03 +10:00
vocab.pxd Replace cpdef variables with cdef (#7834) 2021-04-26 16:54:02 +02:00
vocab.pyi Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
vocab.pyx Skip vector ngram backoff if minn is not set (#7925) 2021-05-06 18:34:35 +10:00