Add all vectors to vocab before pruning (#6408)

Add all vectors to the vocab before pruning so that every vector key is
considered when selecting which vectors to prioritize.
Adriane Boyd 2020-11-23 10:00:59 +01:00 committed by GitHub
parent 13f0676f04
commit a8c2dad466

@@ -316,6 +316,9 @@ cdef class Vocab:
         DOCS: https://spacy.io/api/vocab#prune_vectors
         """
         xp = get_array_module(self.vectors.data)
+        # Make sure all vectors are in the vocab
+        for orth in self.vectors:
+            self[orth]
         # Make prob negative so it sorts by rank ascending
         # (key2row contains the rank)
         priority = [(-lex.prob, self.vectors.key2row[lex.orth], lex.orth)
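
The effect of the added loop can be illustrated with a toy model in plain Python. This is only a sketch, not spaCy's implementation: `ToyVocab`, `prune_priority`, `demo`, and the default prob of -20.0 are hypothetical, and it assumes the priority list is built by iterating over lexemes that already exist in the vocab, which the truncated last line of the hunk suggests but does not show.

```python
class ToyVocab:
    """Hypothetical stand-in for a vocab with lazily created lexemes."""

    def __init__(self, key2row, default_prob=-20.0):
        self.key2row = dict(key2row)      # vector key -> row index (rank)
        self.default_prob = default_prob  # assumed prob for unseen lexemes
        self.lexemes = {}                 # key -> prob, filled on first lookup

    def __getitem__(self, key):
        # Looking up a key creates its lexeme entry if it is missing.
        return self.lexemes.setdefault(key, self.default_prob)

    def __iter__(self):
        return iter(self.lexemes.items())


def prune_priority(vocab, add_all_first):
    if add_all_first:
        # Mirrors the added loop: touch every vector key so it gets a lexeme.
        for key in vocab.key2row:
            vocab[key]
    # Negative prob first, then row, so ties on prob are ranked by row.
    priority = [(-prob, vocab.key2row[key], key)
                for key, prob in vocab if key in vocab.key2row]
    return sorted(priority)


def demo():
    rows = {"cat": 0, "dog": 1, "emu": 2}
    before = ToyVocab(rows)
    before["dog"]  # only "dog" has been looked up so far
    after = ToyVocab(rows)
    after["dog"]
    return ([key for _, _, key in prune_priority(before, False)],
            [key for _, _, key in prune_priority(after, True)])
```

In this toy model, `demo()` returns one candidate (`dog`) without the loop and all three keys with it, so keys that were never looked up can no longer be silently left out of the ranking.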