|
|
|
# coding: utf-8
|
|
|
|
from __future__ import unicode_literals
|
|
|
|
|
|
|
|
from ..util import get_doc
|
|
|
|
|
|
|
|
import pytest
|
|
|
|
import numpy
|
|
|
|
|
|
|
|
|
|
|
|
@pytest.mark.parametrize(
    "sentence,heads,matrix",
    [
        (
            "She created a test for spacy",
            [1, 0, 1, -2, -1, -1],
            numpy.array(
                [
                    [0, 1, 1, 1, 1, 1],
                    [1, 1, 1, 1, 1, 1],
                    [1, 1, 2, 3, 3, 3],
                    [1, 1, 3, 3, 3, 3],
                    [1, 1, 3, 3, 4, 4],
                    [1, 1, 3, 3, 4, 5],
                ],
                dtype=numpy.int32,
            ),
        )
    ],
)
def test_issue2396(en_tokenizer, sentence, heads, matrix):
    """Regression test for issue #2396: the lowest-common-ancestor matrix
    computed from a full Doc and from a Span covering the whole Doc must
    both equal the precomputed expected matrix.
    """
    tokenized = en_tokenizer(sentence)
    words = [token.text for token in tokenized]
    doc = get_doc(tokenized.vocab, words, heads=heads)
    # A span over the entire doc should yield the same LCA matrix as the doc.
    whole_span = doc[:]
    assert (doc.get_lca_matrix() == matrix).all()
    assert (whole_span.get_lca_matrix() == matrix).all()