mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-25 17:36:30 +03:00
Docs for v3.3 (#10628)
* Temporarily disable CI tests
* Start v3.3 website updates
* Add trainable lemmatizer to pipeline design
* Fix Vectors.most_similar
* Add floret vector info to pipeline design
* Add Lower and Upper Sorbian
* Add span to sidebar
* Work on release notes
* Copy from release notes
* Update pipeline design graphic
* Upgrading note about Doc.from_docs
* Add tables and details
* Update website/docs/models/index.md
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Fix da lemma acc
* Add minimal intro, various updates
* Round lemma acc
* Add section on floret / word lists
* Add new pipelines table, minor edits
* Fix displacy spans example title
* Clarify adding non-trainable lemmatizer
* Update adding-languages URLs
* Revert "Temporarily disable CI tests"
This reverts commit 1dee505920.
* Spell out words/sec
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
This commit is contained in: parent 10377fb945, commit 497a708c71
@@ -621,7 +621,7 @@ relative clauses.

To customize the noun chunk iterator in a loaded pipeline, modify
[`nlp.vocab.get_noun_chunks`](/api/vocab#attributes). If the `noun_chunk`
[syntax iterator](/usage/linguistic-features#language-data) has not been
implemented for the given language, a `NotImplementedError` is raised.

> #### Example
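The syntax iterator contract is a callable that receives a `Doc` or `Span` and yields `(start, end, label)` triples. A toy sketch of that contract, using a stand-in token list instead of a real `Doc` (the single-noun heuristic is purely illustrative, not spaCy's chunking logic):

```python
# Stand-in for a Doc: a list of (text, pos) pairs. A real syntax
# iterator receives a Doc or Span and yields (start, end, label) triples.
def noun_chunks(tokens, label="NP"):
    for i, (text, pos) in enumerate(tokens):
        # Toy heuristic: each noun is its own single-token chunk.
        if pos in ("NOUN", "PROPN"):
            yield i, i + 1, label

tokens = [("The", "DET"), ("cat", "NOUN"), ("sat", "VERB")]
print(list(noun_chunks(tokens)))  # [(1, 2, 'NP')]
```

A real replacement assigned to `nlp.vocab.get_noun_chunks` would follow the same shape, yielding offsets into the `Doc` plus a label hash.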
@@ -283,8 +283,9 @@ objects, if the document has been syntactically parsed. A base noun phrase, or

it – so no NP-level coordination, no prepositional phrases, and no relative
clauses.

If the `noun_chunk` [syntax iterator](/usage/linguistic-features#language-data)
has not been implemented for the given language, a `NotImplementedError` is
raised.

> #### Example
>
@@ -520,12 +521,13 @@ sent = doc[sent.start : max(sent.end, span.end)]

## Span.sents {#sents tag="property" model="sentences" new="3.2.1"}

Returns a generator over the sentences the span belongs to. This property is
only available when [sentence boundaries](/usage/linguistic-features#sbd) have
been set on the document by the `parser`, `senter`, `sentencizer` or some custom
function. It will raise an error otherwise.

If the span happens to cross sentence boundaries, all sentences the span
overlaps with will be returned.

> #### Example
>
@@ -347,14 +347,14 @@ supported for `floret` mode.

> most_similar = nlp.vocab.vectors.most_similar(queries, n=10)
> ```

| Name           | Description                                                                                                             |
| -------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `queries`      | An array with one or more vectors. ~~numpy.ndarray~~                                                                     |
| _keyword-only_ |                                                                                                                          |
| `batch_size`   | The batch size to use. Defaults to `1024`. ~~int~~                                                                       |
| `n`            | The number of entries to return for each query. Defaults to `1`. ~~int~~                                                 |
| `sort`         | Whether to sort the entries returned by score. Defaults to `True`. ~~bool~~                                              |
| **RETURNS**    | The most similar entries as a `(keys, best_rows, scores)` tuple. ~~Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]~~  |

## Vectors.get_batch {#get_batch tag="method" new="3.2"}
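For intuition, `most_similar` amounts to a batched cosine-similarity search over the rows of the vector table. A minimal numpy sketch of that computation (not spaCy's implementation, which additionally handles key lookup, batching and the `sort` flag):

```python
import numpy as np

def most_similar(table, queries, n=1):
    """Return (best_rows, scores): the n most cosine-similar table rows
    per query, sorted by descending score."""
    # Normalize rows so the dot product is the cosine similarity.
    t = table / np.linalg.norm(table, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    scores = q @ t.T                              # (n_queries, n_rows)
    best_rows = np.argsort(-scores, axis=1)[:, :n]
    return best_rows, np.take_along_axis(scores, best_rows, axis=1)

table = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
queries = np.array([[2.0, 0.0]])
rows, scores = most_similar(table, queries, n=2)
print(rows[0])  # [0 2]
```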
File diff suppressed because one or more lines are too long
Image updated: size 27 KiB before, 108 KiB after.
@@ -30,10 +30,16 @@ into three components:

   tagging, parsing, lemmatization and named entity recognition, or `dep` for
   only tagging, parsing and lemmatization).
2. **Genre:** Type of text the pipeline is trained on, e.g. `web` or `news`.
3. **Size:** Package size indicator, `sm`, `md`, `lg` or `trf`.

   `sm` and `trf` pipelines have no static word vectors.

   For pipelines with default vectors, `md` has a reduced word vector table with
   20k unique vectors for ~500k words and `lg` has a large word vector table
   with ~500k entries.

   For pipelines with floret vectors, `md` vector tables have 50k entries and
   `lg` vector tables have 200k entries.

For example, [`en_core_web_sm`](/models/en#en_core_web_sm) is a small English
pipeline trained on written web text (blogs, news, comments), that includes
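Pipeline package names encode exactly these components after the language code. A quick sketch of how a name like `en_core_web_sm` breaks down (the helper is ours for illustration, not a spaCy API):

```python
def parse_pipeline_name(name):
    """Split a pipeline package name into its naming-convention parts:
    language code, type, genre and size, e.g. en_core_web_sm."""
    lang, type_, genre, size = name.split("_")
    return {"lang": lang, "type": type_, "genre": genre, "size": size}

print(parse_pipeline_name("en_core_web_sm"))
# {'lang': 'en', 'type': 'core', 'genre': 'web', 'size': 'sm'}
```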
@@ -90,19 +96,42 @@ Main changes from spaCy v2 models:

In the `sm`/`md`/`lg` models:

- The `tagger`, `morphologizer` and `parser` components listen to the `tok2vec`
  component. If the lemmatizer is trainable (v3.3+), `lemmatizer` also listens
  to `tok2vec`.
- The `attribute_ruler` maps `token.tag` to `token.pos` if there is no
  `morphologizer`. The `attribute_ruler` additionally makes sure whitespace is
  tagged consistently and copies `token.pos` to `token.tag` if there is no
  tagger. For English, the attribute ruler can improve its mapping from
  `token.tag` to `token.pos` if dependency parses from a `parser` are present,
  but the parser is not required.
- The `lemmatizer` component for many languages requires `token.pos` annotation
  from either `tagger`+`attribute_ruler` or `morphologizer`.
- The `ner` component is independent with its own internal tok2vec layer.

#### CNN/CPU pipelines with floret vectors

The Finnish, Korean and Swedish `md` and `lg` pipelines use
[floret vectors](/usage/v3-2#vectors) instead of default vectors. If you're
running a trained pipeline on texts and working with [`Doc`](/api/doc) objects,
you shouldn't notice any difference with floret vectors. With floret vectors no
tokens are out-of-vocabulary, so [`Token.is_oov`](/api/token#attributes) will
return `False` for all tokens.

If you access vectors directly for similarity comparisons, there are a few
differences because floret vectors don't include a fixed word list like the
vector keys for default vectors.

- If your workflow iterates over the vector keys, you need to use an external
  word list instead:

  ```diff
  - lexemes = [nlp.vocab[orth] for orth in nlp.vocab.vectors]
  + lexemes = [nlp.vocab[word] for word in external_word_list]
  ```

- [`Vectors.most_similar`](/api/vectors#most_similar) is not supported because
  there's no fixed list of vectors to compare your vectors to.

### Transformer pipeline design {#design-trf}

In the transformer (`trf`) models, the `tagger`, `parser` and `ner` (if present)
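The reason there's no fixed word list: floret builds a vector for any string by hashing its character n-grams into a fixed-size table (Bloom embeddings). A toy sketch of that idea, where the hash function and n-gram settings are illustrative stand-ins rather than floret's actual scheme:

```python
import zlib

def subword_rows(word, n_buckets, minn=3, maxn=4):
    """Map each character n-gram of "<word>" to a row in a fixed-size
    table. Any string hashes to some rows, so nothing is out-of-vocabulary."""
    w = f"<{word}>"
    ngrams = [w[i : i + n] for n in range(minn, maxn + 1)
              for i in range(len(w) - n + 1)]
    return [zlib.crc32(g.encode("utf8")) % n_buckets for g in ngrams]

# Even a word never seen in training maps to valid table rows,
# whose embeddings would be summed to produce its vector:
rows = subword_rows("spaCy", n_buckets=1000)
print(all(0 <= r < 1000 for r in rows))  # True
```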
@@ -133,10 +162,14 @@ nlp = spacy.load("en_core_web_trf", disable=["tagger", "attribute_ruler", "lemma

<Infobox variant="warning" title="Rule-based and POS-lookup lemmatizers require
Token.pos">

The lemmatizer depends on `tagger`+`attribute_ruler` or `morphologizer` for a
number of languages. If you disable any of these components, you'll see
lemmatizer warnings unless the lemmatizer is also disabled.

**v3.3**: Catalan, English, French, Russian and Spanish

**v3.0-v3.2**: Catalan, Dutch, English, French, Greek, Italian, Macedonian,
Norwegian, Polish, Russian and Spanish

</Infobox>
@@ -154,10 +187,34 @@ nlp.enable_pipe("senter")

The `senter` component is ~10× faster than the parser and more accurate
than the rule-based `sentencizer`.

#### Switch from trainable lemmatizer to default lemmatizer

Since v3.3, a number of pipelines use a trainable lemmatizer. You can check
whether the lemmatizer is trainable:

```python
nlp = spacy.load("de_core_news_sm")
assert nlp.get_pipe("lemmatizer").is_trainable
```

If you'd like to switch to a non-trainable lemmatizer that's similar to v3.2 or
earlier, you can replace the trainable lemmatizer with the default non-trainable
lemmatizer:

```python
# Requirements: pip install spacy-lookups-data
nlp = spacy.load("de_core_news_sm")
# Remove existing lemmatizer
nlp.remove_pipe("lemmatizer")
# Add non-trainable lemmatizer from language defaults
# and load lemmatizer tables from spacy-lookups-data
nlp.add_pipe("lemmatizer").initialize()
```

#### Switch from rule-based to lookup lemmatization

For the Dutch, English, French, Greek, Macedonian, Norwegian and Spanish
pipelines, you can swap out a trainable or rule-based lemmatizer for a lookup
lemmatizer:
247 website/docs/usage/v3-3.md (new file)

@@ -0,0 +1,247 @@

---
title: What's New in v3.3
teaser: New features and how to upgrade
menu:
  - ['New Features', 'features']
  - ['Upgrading Notes', 'upgrading']
---

## New features {#features hidden="true"}

spaCy v3.3 improves the speed of core pipeline components, adds a new trainable
lemmatizer, and introduces trained pipelines for Finnish, Korean and Swedish.

### Speed improvements {#speed}

v3.3 includes a slew of speed improvements:

- Speed up parser and NER by using constant-time head lookups.
- Support unnormalized softmax probabilities in `spacy.Tagger.v2` to speed up
  inference for tagger, morphologizer, senter and trainable lemmatizer.
- Speed up parser projectivization functions.
- Replace `Ragged` with faster `AlignmentArray` in `Example` for training.
- Improve `Matcher` speed.
- Improve serialization speed for empty `Doc.spans`.

For longer texts, the trained pipeline speeds improve **15%** or more in
prediction. We benchmarked `en_core_web_md` (same components as in v3.2) and
`de_core_news_md` (with the new trainable lemmatizer) across a range of text
sizes on Linux (Intel Xeon W-2265) and OS X (M1) to compare spaCy v3.2 vs. v3.3:

**Intel Xeon W-2265**

| Model                                            | Avg. Words/Doc | v3.2 Words/Sec | v3.3 Words/Sec | Diff   |
| :----------------------------------------------- | -------------: | -------------: | -------------: | -----: |
| [`en_core_web_md`](/models/en#en_core_web_md)    | 100            | 17292          | 17441          | 0.86%  |
| (=same components)                               | 1000           | 15408          | 16024          | 4.00%  |
|                                                  | 10000          | 12798          | 15346          | 19.91% |
| [`de_core_news_md`](/models/de/#de_core_news_md) | 100            | 20221          | 19321          | -4.45% |
| (+v3.3 trainable lemmatizer)                     | 1000           | 17480          | 17345          | -0.77% |
|                                                  | 10000          | 14513          | 17036          | 17.38% |

**Apple M1**

| Model                                            | Avg. Words/Doc | v3.2 Words/Sec | v3.3 Words/Sec | Diff   |
| ------------------------------------------------ | -------------: | -------------: | -------------: | -----: |
| [`en_core_web_md`](/models/en#en_core_web_md)    | 100            | 18272          | 18408          | 0.74%  |
| (=same components)                               | 1000           | 18794          | 19248          | 2.42%  |
|                                                  | 10000          | 15144          | 17513          | 15.64% |
| [`de_core_news_md`](/models/de/#de_core_news_md) | 100            | 19227          | 19591          | 1.89%  |
| (+v3.3 trainable lemmatizer)                     | 1000           | 20047          | 20628          | 2.90%  |
|                                                  | 10000          | 15921          | 18546          | 16.49% |
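The Diff column above is the relative throughput change from v3.2 to v3.3. As a quick sketch (the formula is our assumption about how the column was computed, and it reproduces the table values):

```python
def words_per_sec_diff(v32, v33):
    """Relative throughput change from v3.2 to v3.3, as a percentage."""
    return round((v33 - v32) / v32 * 100, 2)

# First and last Intel Xeon rows from the benchmark tables above:
print(words_per_sec_diff(17292, 17441))  # 0.86
print(words_per_sec_diff(14513, 17036))  # 17.38
```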
### Trainable lemmatizer {#trainable-lemmatizer}

The new [trainable lemmatizer](/api/edittreelemmatizer) component uses
[edit trees](https://explosion.ai/blog/edit-tree-lemmatizer) to transform tokens
into lemmas. Try out the trainable lemmatizer with the
[training quickstart](/usage/training#quickstart)!
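Edit trees generalize learned suffix-rewrite rules. As a toy illustration of the underlying idea only (spaCy's edit trees compose keep/replace nodes over arbitrary substrings, not just suffixes):

```python
def apply_suffix_rule(token, old_suffix, new_suffix):
    """Toy stand-in for one learned lemmatization action: rewrite a
    word's suffix. Real edit trees are trees of such string edits."""
    if token.endswith(old_suffix):
        return token[: len(token) - len(old_suffix)] + new_suffix
    return token

print(apply_suffix_rule("walking", "ing", ""))  # walk
print(apply_suffix_rule("flies", "ies", "y"))   # fly
```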
### displaCy support for overlapping spans and arcs {#displacy}

displaCy now supports overlapping spans with a new
[`span`](/usage/visualizers#span) style and multiple arcs with different labels
between the same tokens for [`dep`](/usage/visualizers#dep) visualizations.

Overlapping spans can be visualized for any spans key in `doc.spans`:

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

nlp = spacy.blank("en")
text = "Welcome to the Bank of China."
doc = nlp(text)
doc.spans["custom"] = [Span(doc, 3, 6, "ORG"), Span(doc, 5, 6, "GPE")]
displacy.serve(doc, style="span", options={"spans_key": "custom"})
```

import DisplacySpanHtml from 'images/displacy-span.html'

<Iframe title="displaCy visualizer for overlapping spans" html={DisplacySpanHtml} height={180} />
## Additional features and improvements

- Config comparisons with [`spacy debug diff-config`](/api/cli#debug-diff).
- Span suggester debugging with
  [`SpanCategorizer.set_candidates`](/api/spancategorizer#set_candidates).
- Big endian support with
  [`thinc-bigendian-ops`](https://github.com/andrewsi-z/thinc-bigendian-ops) and
  updates to make `floret`, `murmurhash`, Thinc and spaCy endian neutral.
- Initial support for Lower Sorbian and Upper Sorbian.
- Language updates for English, French, Italian, Japanese, Korean, Norwegian,
  Russian, Slovenian, Spanish, Turkish, Ukrainian and Vietnamese.
- New noun chunks for Finnish.
## Trained pipelines {#pipelines}

### New trained pipelines {#new-pipelines}

v3.3 introduces new CPU/CNN pipelines for Finnish, Korean and Swedish, which use
the new trainable lemmatizer and
[floret vectors](https://github.com/explosion/floret). Due to the use of
[Bloom embeddings](https://explosion.ai/blog/bloom-embeddings) and subwords, the
pipelines have compact vectors with no out-of-vocabulary words.

| Package                                         | Language | UPOS | Parser LAS | NER F |
| ----------------------------------------------- | -------- | ---: | ---------: | ----: |
| [`fi_core_news_sm`](/models/fi#fi_core_news_sm) | Finnish  | 92.5 |       71.9 |  75.9 |
| [`fi_core_news_md`](/models/fi#fi_core_news_md) | Finnish  | 95.9 |       78.6 |  80.6 |
| [`fi_core_news_lg`](/models/fi#fi_core_news_lg) | Finnish  | 96.2 |       79.4 |  82.4 |
| [`ko_core_news_sm`](/models/ko#ko_core_news_sm) | Korean   | 86.1 |       65.6 |  71.3 |
| [`ko_core_news_md`](/models/ko#ko_core_news_md) | Korean   | 94.7 |       80.9 |  83.1 |
| [`ko_core_news_lg`](/models/ko#ko_core_news_lg) | Korean   | 94.7 |       81.3 |  85.3 |
| [`sv_core_news_sm`](/models/sv#sv_core_news_sm) | Swedish  | 95.0 |       75.9 |  74.7 |
| [`sv_core_news_md`](/models/sv#sv_core_news_md) | Swedish  | 96.3 |       78.5 |  79.3 |
| [`sv_core_news_lg`](/models/sv#sv_core_news_lg) | Swedish  | 96.3 |       79.1 |  81.1 |
### Pipeline updates {#pipeline-updates}

The following languages switch from lookup or rule-based lemmatizers to the new
trainable lemmatizer: Danish, Dutch, German, Greek, Italian, Lithuanian,
Norwegian, Polish, Portuguese and Romanian. The overall lemmatizer accuracy
improves for all of these pipelines, but be aware that the types of errors may
look quite different from the lookup-based lemmatizers. If you'd prefer to
continue using the previous lemmatizer, you can
[switch from the trainable lemmatizer to a non-trainable lemmatizer](/models#design-modify).

<figure>

| Model                                           | v3.2 Lemma Acc | v3.3 Lemma Acc |
| ----------------------------------------------- | -------------: | -------------: |
| [`da_core_news_md`](/models/da#da_core_news_md) |           84.9 |           94.8 |
| [`de_core_news_md`](/models/de#de_core_news_md) |           73.4 |           97.7 |
| [`el_core_news_md`](/models/el#el_core_news_md) |           56.5 |           88.9 |
| [`fi_core_news_md`](/models/fi#fi_core_news_md) |              - |           86.2 |
| [`it_core_news_md`](/models/it#it_core_news_md) |           86.6 |           97.2 |
| [`ko_core_news_md`](/models/ko#ko_core_news_md) |              - |           90.0 |
| [`lt_core_news_md`](/models/lt#lt_core_news_md) |           71.1 |           84.8 |
| [`nb_core_news_md`](/models/nb#nb_core_news_md) |           76.7 |           97.1 |
| [`nl_core_news_md`](/models/nl#nl_core_news_md) |           81.5 |           94.0 |
| [`pl_core_news_md`](/models/pl#pl_core_news_md) |           87.1 |           93.7 |
| [`pt_core_news_md`](/models/pt#pt_core_news_md) |           76.7 |           96.9 |
| [`ro_core_news_md`](/models/ro#ro_core_news_md) |           81.8 |           95.5 |
| [`sv_core_news_md`](/models/sv#sv_core_news_md) |              - |           95.5 |

</figure>

In addition, the vectors in the English pipelines are deduplicated to improve
the pruned vectors in the `md` models and reduce the `lg` model size.
## Notes about upgrading from v3.2 {#upgrading}

### Span comparisons

Span comparisons involving ordering (`<`, `<=`, `>`, `>=`) now take all span
attributes into account (start, end, label, and KB ID) so spans may be sorted in
a slightly different order.
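A sketch of the new ordering semantics, using plain attribute tuples as stand-ins for real `Span` objects (the tuple layout mirrors the attribute order described above):

```python
# Stand-in spans as (start, end, label, kb_id) tuples. Sorting compares
# all four fields in order, so two spans with identical offsets are now
# ordered by label (and then KB ID) instead of being tied.
spans = [
    (3, 6, "ORG", ""),
    (3, 6, "GPE", ""),
    (0, 2, "PERSON", ""),
]
spans.sort()
print(spans[1])  # (3, 6, 'GPE', '') - GPE sorts before ORG at equal offsets
```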
### Whitespace annotation

During training, annotation on whitespace tokens is handled in the same way as
annotation on non-whitespace tokens in order to allow custom whitespace
annotation.
### Doc.from_docs

[`Doc.from_docs`](/api/doc#from_docs) now includes `Doc.tensor` by default and
supports excludes with an `exclude` argument in the same format as
`Doc.to_bytes`. The supported exclude fields are `spans`, `tensor` and
`user_data`.

Docs including `Doc.tensor` may be quite a bit larger in RAM, so to exclude
`Doc.tensor` as in v3.2:

```diff
-merged_doc = Doc.from_docs(docs)
+merged_doc = Doc.from_docs(docs, exclude=["tensor"])
```
### Using trained pipelines with floret vectors

If you're running a new trained pipeline for Finnish, Korean or Swedish on new
texts and working with `Doc` objects, you shouldn't notice any difference with
floret vectors vs. default vectors.

If you use vectors for similarity comparisons, there are a few differences,
mainly because a floret pipeline doesn't include any kind of frequency-based
word list similar to the list of in-vocabulary vector keys with default vectors.

- If your workflow iterates over the vector keys, you should use an external
  word list instead:

  ```diff
  - lexemes = [nlp.vocab[orth] for orth in nlp.vocab.vectors]
  + lexemes = [nlp.vocab[word] for word in external_word_list]
  ```

- `Vectors.most_similar` is not supported because there's no fixed list of
  vectors to compare your vectors to.
### Pipeline package version compatibility {#version-compat}

> #### Using legacy implementations
>
> In spaCy v3, you'll still be able to load and reference legacy implementations
> via [`spacy-legacy`](https://github.com/explosion/spacy-legacy), even if the
> components or architectures change and newer versions are available in the
> core library.

When you're loading a pipeline package trained with an earlier version of spaCy
v3, you will see a warning telling you that the pipeline may be incompatible.
This doesn't necessarily have to be true, but we recommend running your
pipelines against your test suite or evaluation data to make sure there are no
unexpected results.

If you're using one of the [trained pipelines](/models) we provide, you should
run [`spacy download`](/api/cli#download) to update to the latest version. To
see an overview of all installed packages and their compatibility, you can run
[`spacy validate`](/api/cli#validate).

If you've trained your own custom pipeline and you've confirmed that it's still
working as expected, you can update the spaCy version requirements in the
[`meta.json`](/api/data-formats#meta):

```diff
- "spacy_version": ">=3.2.0,<3.3.0",
+ "spacy_version": ">=3.2.0,<3.4.0",
```
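A minimal sketch of what such a range requirement means, using a naive numeric-tuple comparison (real requirement parsing should use the `packaging` library, which also handles pre-releases and suffixes):

```python
def version_tuple(v):
    """Naive parse of a dotted release version like '3.3.0'."""
    return tuple(int(part) for part in v.split("."))

def satisfies(version, minimum, below):
    """True if minimum <= version < below, e.g. '>=3.2.0,<3.4.0'."""
    return version_tuple(minimum) <= version_tuple(version) < version_tuple(below)

# With the widened range, a v3.3 install satisfies the requirement:
print(satisfies("3.3.0", "3.2.0", "3.4.0"))  # True
print(satisfies("3.4.1", "3.2.0", "3.4.0"))  # False
```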
### Updating v3.2 configs

To update a config from spaCy v3.2 with the new v3.3 settings, run
[`init fill-config`](/api/cli#init-fill-config):

```cli
$ python -m spacy init fill-config config-v3.2.cfg config-v3.3.cfg
```

In many cases ([`spacy train`](/api/cli#train),
[`spacy.load`](/api/top-level#spacy.load)), the new defaults will be filled in
automatically, but you'll need to fill in the new settings to run
[`debug config`](/api/cli#debug) and [`debug data`](/api/cli#debug-data).

To see the speed improvements for the
[`Tagger` architecture](/api/architectures#Tagger), edit your config to switch
from `spacy.Tagger.v1` to `spacy.Tagger.v2` and then run `init fill-config`.
@@ -5,6 +5,7 @@ new: 2

menu:
  - ['Dependencies', 'dep']
  - ['Named Entities', 'ent']
  - ['Spans', 'span']
  - ['Jupyter Notebooks', 'jupyter']
  - ['Rendering HTML', 'html']
  - ['Web app usage', 'webapp']
@@ -192,7 +193,7 @@ displacy.serve(doc, style="span")

import DisplacySpanHtml from 'images/displacy-span.html'

<Iframe title="displaCy visualizer for overlapping spans" html={DisplacySpanHtml} height={180} />

The span visualizer lets you customize the following `options`:
@@ -62,6 +62,11 @@

"example": "Dies ist ein Satz.",
"has_examples": true
},
{
  "code": "dsb",
  "name": "Lower Sorbian",
  "has_examples": true
},
{
  "code": "el",
  "name": "Greek",
@@ -159,6 +164,11 @@

"name": "Croatian",
"has_examples": true
},
{
  "code": "hsb",
  "name": "Upper Sorbian",
  "has_examples": true
},
{
  "code": "hu",
  "name": "Hungarian",
@@ -11,7 +11,8 @@

{ "text": "spaCy 101", "url": "/usage/spacy-101" },
{ "text": "New in v3.0", "url": "/usage/v3" },
{ "text": "New in v3.1", "url": "/usage/v3-1" },
{ "text": "New in v3.2", "url": "/usage/v3-2" },
{ "text": "New in v3.3", "url": "/usage/v3-3" }
]
},
{
@@ -120,8 +120,8 @@ const AlertSpace = ({ nightly, legacy }) => {

}

const navAlert = (
  <Link to="/usage/v3-3" hidden>
    <strong>💥 Out now:</strong> spaCy v3.3
  </Link>
)