spaCy/spacy/tests/matcher
Adriane Boyd 0d9740e826 Replace PhraseMatcher with Aho-Corasick
Replace PhraseMatcher with the Aho-Corasick algorithm over numpy arrays
of the hash values for the relevant attribute. The implementation is
based on FlashText.

The speed should be similar to the previous PhraseMatcher. It is now
possible to easily remove match IDs, and matches no longer go missing
with large keyword lists / vocabularies.
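
A minimal sketch of the general idea, not the code from this commit: a
FlashText-style trie keyed on per-token hash values, with support for
removing a match ID. The names TriePhraseMatcher, attr_hash and _END are
hypothetical, the hash function is a stand-in for spaCy's own attribute
hashing, and plain dicts are used instead of numpy arrays. It restarts
from every token position and has no Aho-Corasick failure links.

```python
import hashlib

_END = "_end_"  # hypothetical sentinel key marking the end of a stored phrase


def attr_hash(text):
    """Stand-in 64-bit hash of a token attribute (assumption, not spaCy's hash)."""
    digest = hashlib.blake2b(text.encode("utf8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class TriePhraseMatcher:
    """Toy phrase matcher: a nested-dict trie over token hash values."""

    def __init__(self):
        self.root = {}

    def add(self, key, phrases):
        """Store each phrase (a list of token strings) under match ID `key`."""
        for phrase in phrases:
            node = self.root
            for token in phrase:
                node = node.setdefault(attr_hash(token), {})
            node.setdefault(_END, set()).add(key)

    def remove(self, key):
        """Drop match ID `key` and prune branches that no longer end a phrase."""
        self._prune(self.root, key)

    def _prune(self, node, key):
        for edge in list(node):
            if edge == _END:
                node[_END].discard(key)
                if not node[_END]:
                    del node[_END]
            else:
                self._prune(node[edge], key)
                if not node[edge]:
                    del node[edge]

    def __call__(self, tokens):
        """Return (key, start, end) for every stored phrase found in `tokens`."""
        hashes = [attr_hash(t) for t in tokens]
        matches = []
        for start in range(len(hashes)):
            node = self.root
            for end in range(start, len(hashes)):
                node = node.get(hashes[end])
                if node is None:
                    break
                for key in sorted(node.get(_END, ())):
                    matches.append((key, start, end + 1))
        return matches
```

Because every phrase is stored as an explicit path in the trie, removing
a match ID only means discarding it from the end markers and pruning the
branches that become empty.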

Fixes #4308.
2019-09-19 16:49:05 +02:00
__init__.py 💫 Refactor test suite (#2568) 2018-07-24 23:38:44 +02:00
test_matcher_api.py Tidy up and auto-format [ci skip] 2019-08-31 13:39:06 +02:00
test_matcher_logic.py Tidy up and auto-format 2019-08-20 17:36:34 +02:00
test_pattern_validation.py Improve token pattern checking without validation (#4105) 2019-08-21 14:00:37 +02:00
test_phrase_matcher.py Replace PhraseMatcher with Aho-Corasick 2019-09-19 16:49:05 +02:00