mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-27 10:26:35 +03:00
2eb925bd05
* Preserve flags in EntityRuler The EntityRuler (explosion/spaCy#3526) does not preserve overwrite flags (or `ent_id_sep`) when serialized. This commit adds support for serialization/deserialization preserving overwrite and ent_id_sep flags. * add signed contributor agreement * flake8 cleanup mostly blank line issues. * mark test from the issue as needing a model The test from the issue needs some language model for serialization but the test wasn't originally marked correctly. * Adds `phrase_matcher_attr` to allow args to PhraseMatcher This is an added arg to pass to the `PhraseMatcher`. For example, this allows creation of a case insensitive phrase matcher when the `EntityRuler` is created. References explosion/spaCy#3822 * remove unneeded model loading The model didn't need to be loaded, and I replaced it with a change that doesn't require it (using existing fixtures) * updated docstring for new argument * updated docs to reflect new argument to the EntityRuler constructor * change tempdir handling to be compatible with python 2.7 * return conflicted code to entityruler Some stuff got cut out because of merge conflicts, this returns that code for the phrase_matcher_attr. * fixed typo in the code added back after conflicts * flake8 compliance When I deconflicted the branch there were some flake8 issues introduced. This resolves the spacing problems. * test changes: attempts to fix flaky test in python3.5 These tests seem to be a little flaky in 3.5 so I changed the check to avoid the comparisons that seem to fail sometimes. |
||
---|---|---|
.. | ||
annotation.md | ||
cli.md | ||
cython-classes.md | ||
cython-structs.md | ||
cython.md | ||
dependencyparser.md | ||
doc.md | ||
entityrecognizer.md | ||
entityruler.md | ||
goldcorpus.md | ||
goldparse.md | ||
index.md | ||
language.md | ||
lemmatizer.md | ||
lexeme.md | ||
matcher.md | ||
phrasematcher.md | ||
pipeline-functions.md | ||
scorer.md | ||
sentencizer.md | ||
span.md | ||
stringstore.md | ||
tagger.md | ||
textcategorizer.md | ||
token.md | ||
tokenizer.md | ||
top-level.md | ||
vectors.md | ||
vocab.md |