//- 💫 DOCS > USAGE > WORD VECTORS & SIMILARITIES

include ../../_includes/_mixins

p
    |  Dense, real-valued vectors representing distributional similarity
    |  information are now a cornerstone of practical NLP. The most common
    |  way to train these vectors is the
    |  #[+a("https://en.wikipedia.org/wiki/Word2vec") word2vec] family of
    |  algorithms. The default
    |  #[+a("/docs/usage/models#available") English model] installs
    |  300-dimensional vectors trained on the Common Crawl corpus using the
    |  #[+a("http://nlp.stanford.edu/projects/glove/") GloVe] algorithm. The
    |  GloVe Common Crawl vectors have become a de facto standard for
    |  practical NLP.

+aside("Tip: Training a word2vec model")
    |  If you need to train a word2vec model, we recommend the implementation
    |  in the Python library #[+a("https://radimrehurek.com/gensim/") Gensim].

+h(2, "101") Similarity and word vectors 101
    +tag-model("vectors")

include _spacy-101/_similarity
include _spacy-101/_word-vectors

+h(2, "custom") Customising word vectors

p
    |  By default, #[+api("token#vector") #[code Token.vector]] returns the
    |  vector for its underlying #[+api("lexeme") #[code Lexeme]], while
    |  #[+api("doc#vector") #[code Doc.vector]] and
    |  #[+api("span#vector") #[code Span.vector]] return an average of the
    |  vectors of their tokens.

p
    |  You can customise these behaviours by modifying the
    |  #[code doc.user_hooks], #[code doc.user_span_hooks] and
    |  #[code doc.user_token_hooks] dictionaries. For example, the sketch
    |  below overrides #[code Doc.vector] with an average that skips stop
    |  words — #[code "vector"] is the hook key the #[code Doc] checks before
    |  falling back to the default computation.

+code("Example").
    import numpy
    import spacy

    nlp = spacy.load('en')

    def vector_without_stop_words(doc):
        # Average the vectors of content tokens only, falling back to a
        # zero vector if nothing qualifies. This is a sketch, not a recipe.
        vectors = [t.vector for t in doc if t.has_vector and not t.is_stop]
        if not vectors:
            return numpy.zeros((nlp.vocab.vectors_length,), dtype='float32')
        return numpy.mean(vectors, axis=0)

    doc = nlp(u'The quick brown fox jumped')
    doc.user_hooks['vector'] = vector_without_stop_words
    print(doc.vector)

p
    |  You can load new word vectors from a file-like buffer using the
    |  #[code vocab.load_vectors()] method. The file should be a
    |  whitespace-delimited text file, where the word is in the first column,
    |  and subsequent columns provide the vector data. For faster loading,
    |  you can use the #[code vocab.load_vectors_from_bin_loc()] method,
    |  which accepts a path to a binary file written by
    |  #[code vocab.dump_vectors()]. The file paths below are placeholders.

+code("Example").
    import io
    import spacy

    nlp = spacy.load('en')

    # Plain-text format: one word per line, followed by its vector values,
    # all whitespace-delimited.
    with io.open('/path/to/vectors.txt', 'r', encoding='utf8') as file_:
        nlp.vocab.load_vectors(file_)

    # Dump the vectors to a binary file once, then reload them quickly.
    nlp.vocab.dump_vectors('/path/to/vectors.bin')
    nlp.vocab.load_vectors_from_bin_loc('/path/to/vectors.bin')

p
    |  You can also load vectors from memory by writing to the
    |  #[+api("lexeme#vector") #[code Lexeme.vector]] property. If the
    |  vectors you are writing have a different dimensionality from the ones
    |  currently loaded, you should first call
    |  #[code vocab.resize_vectors(new_size)].

+h(2, "similarity") Similarity
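p
    |  Once vectors are loaded, the #[code similarity()] method is available
    |  on #[code Doc], #[code Span], #[code Token] and #[code Lexeme]
    |  objects, and returns the cosine similarity of their vectors. A minimal
    |  sketch, assuming the default English model with vectors is installed:

+code("Example").
    import spacy

    nlp = spacy.load('en')
    doc = nlp(u'Apples and oranges are similar. Boots and hippos are not.')

    apples = doc[0]
    oranges = doc[2]
    boots = doc[6]
    hippos = doc[8]

    # Similar words should score noticeably higher than dissimilar ones.
    print(apples.similarity(oranges))
    print(boots.similarity(hippos))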