Merge branch 'spacy.io' [ci skip]
This commit is contained in:
parent 23eef78a4a · commit dfb23a419e
@@ -180,7 +180,7 @@ entirely **in Markdown**, without having to compromise on easy-to-use custom UI

components. We're hoping that the Markdown source will make it even easier to
contribute to the documentation. For more details, check out the
[styleguide](/styleguide) and
[source](https://github.com/explosion/spacy/tree/v2.x/website). While
converting the pages to Markdown, we've also fixed a bunch of typos, improved
the existing pages and added some new content:
@@ -161,8 +161,8 @@ debugging your tokenizer configuration.

spaCy's custom warnings have been replaced with native Python
[`warnings`](https://docs.python.org/3/library/warnings.html). Instead of
setting `SPACY_WARNING_IGNORE`, use the
[`warnings` filters](https://docs.python.org/3/library/warnings.html#the-warnings-filter)
to manage warnings.

```diff
import spacy
+ import warnings

- spacy.errors.SPACY_WARNING_IGNORE.append('W007')
+ warnings.filterwarnings("ignore", message=r"\[W007\]", category=UserWarning)
```
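If you only want to silence a warning temporarily, the standard `warnings` machinery also supports scoped filters. A minimal sketch, assuming `en_core_web_sm` is installed and that the warning you want to hide is `W007` (spaCy prefixes its warning messages with codes like `[W007]`):

```python
import warnings

import spacy

nlp = spacy.load("en_core_web_sm")

with warnings.catch_warnings():
    # ignore only W007 (similarity without word vectors) inside this block
    warnings.filterwarnings("ignore", message=r"^\[W007\]")
    doc1 = nlp("apple")
    doc2 = nlp("orange")
    print(doc1.similarity(doc2))
```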
@@ -176,7 +176,7 @@ import spacy

#### Normalization tables

The normalization tables have moved from the language data in
[`spacy/lang`](https://github.com/explosion/spacy/tree/v2.x/spacy/lang) to the
package [`spacy-lookups-data`](https://github.com/explosion/spacy-lookups-data).
If you're adding data for a new language, the normalization table should be
added to `spacy-lookups-data`. See
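For context on where these tables surface at runtime, here is a minimal sketch (assuming `en_core_web_sm` and `spacy-lookups-data` are installed; the sample text is ours):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("I'm gonna realise it")
# Token.norm_ applies the normalization table; it falls back to the
# lowercase form when a word has no entry in the table
print([(token.text, token.norm_) for token in doc])
```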
@@ -190,8 +190,8 @@ lexemes will be added to the vocab automatically, just as in small models

without vectors.

To see the number of unique vectors and the number of words with vectors, check
`nlp.meta['vectors']`; for example, `en_core_web_md` has `20000` unique
vectors and `684830` words with vectors:

```python
{
    "width": 300,
    "vectors": 20000,
    "keys": 684830,
    "name": "en_core_web_md.vectors"
}
```
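A quick hedged sketch of reading those counts off a loaded pipeline (assumes `en_core_web_md` is installed):

```python
import spacy

nlp = spacy.load("en_core_web_md")
vectors_meta = nlp.meta["vectors"]
print(vectors_meta["vectors"], "unique vectors")   # 20000
print(vectors_meta["keys"], "words with vectors")  # 684830
# the vectors table itself reports the same key count
print(nlp.vocab.vectors.n_keys)
```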
@@ -210,8 +210,8 @@ for orth in nlp.vocab.vectors:

```python
for orth in nlp.vocab.vectors:
    _ = nlp.vocab[orth]
```

If your workflow previously iterated over `nlp.vocab`, a similar alternative is
to iterate over words with vectors instead:

```diff
- lexemes = [w for w in nlp.vocab]
+ lexemes = [nlp.vocab[orth] for orth in nlp.vocab.vectors]
```
@@ -220,9 +220,9 @@ is to iterate over words with vectors instead:

Be aware that the set of preloaded lexemes in a v2.2 model is not equivalent to
the set of words with vectors. For English, v2.2 `md/lg` models have 1.3M
provided lexemes but only 685K words with vectors. The vectors have been updated
for most languages in v2.2, but the English models contain the same vectors for
both v2.2 and v2.3.

#### Lexeme.is_oov and Token.is_oov
@@ -234,8 +234,7 @@ fixed in the next patch release v2.3.1.

</Infobox>

In v2.3, `Lexeme.is_oov` and `Token.is_oov` are `True` if the lexeme does not
have a word vector. This is equivalent to `token.orth not in nlp.vocab.vectors`.

Previously in v2.2, `is_oov` corresponded to whether a lexeme had stored
probability and cluster features. The probability and cluster features are no
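A small sketch of checking that equivalence (assumes `en_core_web_md`; note the Infobox above, which flags an `is_oov` bug in v2.3.0 fixed in v2.3.1, so the two values may disagree on v2.3.0):

```python
import spacy

nlp = spacy.load("en_core_web_md")
for token in nlp("hello supercalifragilistic"):
    # v2.3 semantics: is_oov means "has no word vector"
    print(token.text, token.is_oov, token.orth not in nlp.vocab.vectors)
```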
@@ -270,8 +269,8 @@ as part of the model vocab.

To load the probability table into a provided model, first make sure you have
`spacy-lookups-data` installed. To load the table, remove the empty provided
`lexeme_prob` table and then access `Lexeme.prob` for any word to load the table
from `spacy-lookups-data`:

```diff
+ # prerequisite: pip install spacy-lookups-data
```
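Since the excerpt ends before the full snippet, here is a hedged reconstruction of the steps described above (`Lookups.has_table`/`remove_table` are the spaCy v2 lookups API; that the placeholder table lives on `nlp.vocab.lookups` is our assumption):

```python
import spacy

# prerequisite: pip install spacy-lookups-data
nlp = spacy.load("en_core_web_md")

# remove the empty placeholder table shipped with the model
if nlp.vocab.lookups.has_table("lexeme_prob"):
    nlp.vocab.lookups.remove_table("lexeme_prob")

# accessing Lexeme.prob for any word then loads the full table
# from spacy-lookups-data
print(nlp.vocab["the"].prob)
```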
@@ -321,9 +320,9 @@ the [train CLI](/api/cli#train), you can use the new `--tag-map-path` option to

provide the tag map as a JSON dict.

If you want to export a tag map from a provided model for use with the train
CLI, you can save it as a JSON dict. To only use string keys as required by JSON
and to make it easier to read and edit, any internal integer IDs need to be
converted back to strings:

```python
import spacy
```
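The code block above is also cut off at the hunk boundary. A hedged sketch of the conversion it describes (`nlp.vocab.morphology.tag_map` is the spaCy v2 tag map; the string-conversion details are our reconstruction, not the docs' original snippet):

```python
import json

import spacy

nlp = spacy.load("en_core_web_sm")
tag_map = {}
for tag, feats in nlp.vocab.morphology.tag_map.items():
    # convert internal integer IDs back to strings so the JSON
    # only contains readable string keys and values
    tag_map[tag] = {
        (nlp.vocab.strings[k] if isinstance(k, int) else k):
            (nlp.vocab.strings[v] if isinstance(v, int) else v)
        for k, v in feats.items()
    }

with open("tag_map.json", "w", encoding="utf8") as f:
    json.dump(tag_map, f, indent=2)
```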
@@ -303,7 +303,7 @@ lookup-based lemmatization – and **many new languages**!

<Infobox>

**API:** [`Language`](/api/language) **Code:**
[`spacy/lang`](https://github.com/explosion/spacy/tree/v2.x/spacy/lang)
**Usage:** [Adding languages](/usage/adding-languages)

</Infobox>