spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-12-12 04:34:31 +03:00

adrianeboyd 40e65d6f63 Fix most_similar for vectors with unused rows (#5348)

* Fix most_similar for vectors with unused rows

  Address issues related to the unused rows in the vector table and `most_similar`:

  * Update `most_similar()` to search only through rows that are in use according to `key2row`.
  * Raise an error when `most_similar(n=n)` is larger than the number of vectors in the table.
  * Set and restore `_unset` correctly when vectors are added or deserialized so that new vectors are added in the correct row.
  * Set data and keys to the same length in `Vocab.prune_vectors()` to avoid spurious entries in `key2row`.

* Fix regression test using `most_similar`

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>

2020-05-19 16:41:26 +02:00
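
The first two changes named in the commit message (searching only rows that are in use according to `key2row`, and raising an error when `n` exceeds the number of vectors) can be illustrated with a small NumPy sketch. This is not spaCy's implementation: the function `most_similar_in_use`, its parameters, and the toy table are hypothetical names chosen for illustration, and the sketch assumes each key maps to a distinct row.

```python
import numpy as np


def most_similar_in_use(data, key2row, query, n=1):
    """Return the n (key, score) pairs most cosine-similar to `query`.

    Hypothetical helper, not part of spaCy. Only rows referenced by
    `key2row` are searched, so allocated-but-unused rows are ignored.
    """
    keys = list(key2row)
    rows = np.array([key2row[k] for k in keys], dtype=int)
    if n > len(rows):
        # Mirrors the idea of rejecting n larger than the number of vectors.
        raise ValueError(f"n={n} exceeds the {len(rows)} vectors in use")

    used = data[rows].astype("float64")
    norms = np.linalg.norm(used, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero vectors
    query_norm = np.linalg.norm(query) or 1.0
    scores = used @ query / (norms * query_norm)

    best = np.argsort(-scores)[:n]  # indices of the highest scores first
    return [(keys[i], float(scores[i])) for i in best]


# Toy table: four allocated rows, only rows 0 and 2 are in use.
data = np.zeros((4, 3))
data[0] = [1.0, 0.0, 0.0]
data[2] = [0.0, 1.0, 0.0]
key2row = {"apple": 0, "pear": 2}

result = most_similar_in_use(data, key2row, np.array([0.9, 0.1, 0.0]), n=1)
```

Rows 1 and 3 of the toy table are never scored, so a request for `n=3` fails with a `ValueError` instead of returning entries for unused rows.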
__init__.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_lexeme.py	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
test_lookups.py	Reduce stored lexemes data, move feats to lookups (#5238)	2020-05-19 15:59:14 +02:00
test_similarity.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_stringstore.py	Revert #4334	2019-09-29 17:32:12 +02:00
test_vectors.py	Fix most_similar for vectors with unused rows (#5348)	2020-05-19 16:41:26 +02:00
test_vocab_api.py	Revert #4334	2019-09-29 17:32:12 +02:00