mirror of
https://github.com/explosion/spaCy.git
synced 2024-11-13 13:17:06 +03:00
40e65d6f63
* Fix most_similar for vectors with unused rows

  Address issues related to the unused rows in the vector table and `most_similar`:

  * Update `most_similar()` to search only through rows that are in use according to `key2row`.
  * Raise an error when `most_similar(n=n)` is larger than the number of vectors in the table.
  * Set and restore `_unset` correctly when vectors are added or deserialized so that new vectors are added in the correct row.
  * Set data and keys to the same length in `Vocab.prune_vectors()` to avoid spurious entries in `key2row`.

* Fix regression test using `most_similar`

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>

| Name |
|---|
| .. |
| `__init__.py` |
| `test_lexeme.py` |
| `test_lookups.py` |
| `test_similarity.py` |
| `test_stringstore.py` |
| `test_vectors.py` |
| `test_vocab_api.py` |
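
The first two points of the commit message above (search only rows that `key2row` marks as in use, and reject an `n` larger than the number of vectors) can be illustrated with a minimal pure-Python sketch. This is not spaCy's implementation: the function `most_similar_in_use`, the list-of-lists `table`, and the choice of `ValueError` are hypothetical stand-ins chosen for illustration, and the real code works on array data rather than Python lists.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_in_use(table, key2row, query, n=1):
    """Hypothetical helper: return the n (key, score) pairs closest to `query`.

    Only rows referenced by `key2row` are searched, so allocated but unused
    rows in `table` can never appear in the results.
    """
    rows_in_use = sorted(set(key2row.values()))
    if n > len(rows_in_use):
        # Assumption: the error type is illustrative, not the library's own.
        raise ValueError(
            f"n={n} is larger than the number of vectors in use ({len(rows_in_use)})"
        )
    # Several keys may share one row; keep the first key seen for each row.
    row2key = {}
    for key, row in key2row.items():
        row2key.setdefault(row, key)
    scored = [(_cosine(table[row], query), row) for row in rows_in_use]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(row2key[row], score) for score, row in scored[:n]]


# A table with four allocated rows, of which only the first two are in use.
table = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
key2row = {"cat": 0, "dog": 1}

print([key for key, _ in most_similar_in_use(table, key2row, [1.0, 0.1], n=2)])
# → ['cat', 'dog']
```

Asking for `n=3` in this example raises the error instead of returning results drawn from the two unused rows, which is the behaviour the commit message describes.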