Commit Graph

2 Commits

Author SHA1 Message Date
Richard Hudson
32954c3bcb
Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786)
* Make changes to typing

* Correction

* Format with black

* Corrections based on review

* Bumped Thinc dependency version

* Bumped blis requirement

* Correction for older Python versions

* Update spacy/ml/models/textcat.py

Co-authored-by: Daniël de Kok <me@github.danieldk.eu>

* Corrections based on review feedback

* Readd deleted docstring line

Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
2022-05-25 09:33:54 +02:00
Adriane Boyd
85778dfcf4
Add edit tree lemmatizer (#10231)
* Add edit tree lemmatizer

Co-authored-by: Daniël de Kok <me@danieldk.eu>

* Hide edit tree lemmatizer labels

* Use relative imports

* Switch to single quotes in error message

* Type annotation fixes

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Reformat edit_tree_lemmatizer with black

* EditTreeLemmatizer.predict: take Iterable

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Validate edit trees during deserialization

This change also changes the serialized representation. Rather than
mirroring the deep C structure, we use a simple flat union of the match
and substitution node types.

* Move edit_trees to _edit_tree_internals

* Fix invalid edit tree format error message

* edit_tree_lemmatizer: remove outdated TODO comment

* Rename factory name to trainable_lemmatizer

* Ignore type instead of casting truths to List[Union[Ints1d, Floats2d, List[int], List[str]]] for thinc v8.0.14

* Switch to Tagger.v2

* Add documentation for EditTreeLemmatizer

* docs: Fix 3.2 -> 3.3 somewhere

* trainable_lemmatizer documentation fixes

* docs: EditTreeLemmatizer is in edit_tree_lemmatizer.py

Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-03-28 11:13:50 +02:00