spaCy/spacy/tests/regression/test_issue850.py

# coding: utf-8
from __future__ import unicode_literals
import pytest

from ...matcher import Matcher
from ...vocab import Vocab
from ...attrs import LOWER
from ...tokens import Doc


def test_basic_case():
    """Test that the Matcher handles the '*' operator on a LOWER attribute."""
    matcher = Matcher(Vocab(
                lex_attr_getters={LOWER: lambda string: string.lower()}))
    IS_ANY_TOKEN = matcher.vocab.add_flag(lambda x: True)
    # 'bob', then zero or more 'and' tokens, then 'frank'
    pattern = [{'LOWER': 'bob'},
               {'OP': '*', 'LOWER': 'and'},
               {'LOWER': 'frank'}]
    matcher.add('FarAway', None, pattern)
    doc = Doc(matcher.vocab, words=['bob', 'and', 'and', 'frank'])
    match = matcher(doc)
    assert len(match) == 1
    ent_id, start, end = match[0]
    assert start == 0
    assert end == 4


def test_issue850():
    """The variable-length pattern can also match the token that follows it.
    Check that the Matcher resolves this ambiguity correctly."""
    matcher = Matcher(Vocab(
                lex_attr_getters={LOWER: lambda string: string.lower()}))
    IS_ANY_TOKEN = matcher.vocab.add_flag(lambda x: True)
    # 'bob', then zero or more arbitrary tokens (IS_ANY_TOKEN), then 'frank'
    pattern = [{'LOWER': 'bob'},
               {'OP': '*', 'IS_ANY_TOKEN': True},
               {'LOWER': 'frank'}]
    matcher.add('FarAway', None, pattern)
    doc = Doc(matcher.vocab, words=['bob', 'and', 'and', 'frank'])
    match = matcher(doc)
    assert len(match) == 1
    ent_id, start, end = match[0]
    assert start == 0
    assert end == 4