---
title: What's New in v3.3
teaser: New features and how to upgrade
menu:
  - ['New Features', 'features']
  - ['Upgrading Notes', 'upgrading']
---

## New features {id="features",hidden="true"}

spaCy v3.3 improves the speed of core pipeline components, adds a new trainable
lemmatizer, and introduces trained pipelines for Finnish, Korean and Swedish.

### Speed improvements {id="speed"}

v3.3 includes a slew of speed improvements:

- Speed up parser and NER by using constant-time head lookups.
- Support unnormalized softmax probabilities in `spacy.Tagger.v2` to speed up
  inference for tagger, morphologizer, senter and trainable lemmatizer.
- Speed up parser projectivization functions.
- Replace `Ragged` with faster `AlignmentArray` in `Example` for training.
- Improve `Matcher` speed.
- Improve serialization speed for empty `Doc.spans`.

For longer texts, the prediction speed of the trained pipelines improves by
**15%** or more. We benchmarked `en_core_web_md` (same components as in v3.2)
and `de_core_news_md` (with the new trainable lemmatizer) across a range of text
sizes on Linux (Intel Xeon W-2265) and macOS (M1) to compare spaCy v3.2 vs.
v3.3:

**Intel Xeon W-2265**

| Model                                            | Avg. Words/Doc | v3.2 Words/Sec | v3.3 Words/Sec |   Diff |
| :----------------------------------------------- | -------------: | -------------: | -------------: | -----: |
| [`en_core_web_md`](/models/en#en_core_web_md)    |            100 |          17292 |          17441 |  0.86% |
| (=same components)                               |           1000 |          15408 |          16024 |  4.00% |
|                                                  |          10000 |          12798 |          15346 | 19.91% |
| [`de_core_news_md`](/models/de/#de_core_news_md) |            100 |          20221 |          19321 | -4.45% |
| (+v3.3 trainable lemmatizer)                     |           1000 |          17480 |          17345 | -0.77% |
|                                                  |          10000 |          14513 |          17036 | 17.38% |

**Apple M1**

| Model                                            | Avg. Words/Doc | v3.2 Words/Sec | v3.3 Words/Sec |   Diff |
| ------------------------------------------------ | -------------: | -------------: | -------------: | -----: |
| [`en_core_web_md`](/models/en#en_core_web_md)    |            100 |          18272 |          18408 |  0.74% |
| (=same components)                               |           1000 |          18794 |          19248 |  2.42% |
|                                                  |          10000 |          15144 |          17513 | 15.64% |
| [`de_core_news_md`](/models/de/#de_core_news_md) |            100 |          19227 |          19591 |  1.89% |
| (+v3.3 trainable lemmatizer)                     |           1000 |          20047 |          20628 |  2.90% |
|                                                  |          10000 |          15921 |          18546 | 16.49% |

### Trainable lemmatizer {id="trainable-lemmatizer"}

The new [trainable lemmatizer](/api/edittreelemmatizer) component uses
[edit trees](https://explosion.ai/blog/edit-tree-lemmatizer) to transform tokens
into lemmas. Try out the trainable lemmatizer with the
[training quickstart](/usage/training#quickstart)!

### displaCy support for overlapping spans and arcs {id="displacy"}

displaCy now supports overlapping spans with a new
[`span`](/usage/visualizers#span) style and multiple arcs with different labels
between the same tokens for [`dep`](/usage/visualizers#dep) visualizations.

Overlapping spans can be visualized for any spans key in `doc.spans`:

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

nlp = spacy.blank("en")
text = "Welcome to the Bank of China."
doc = nlp(text)
# Two overlapping spans under one key: "Bank of China" (ORG) and "China" (GPE)
doc.spans["custom"] = [Span(doc, 3, 6, "ORG"), Span(doc, 5, 6, "GPE")]
# Starts a local web server that shows the visualization
displacy.serve(doc, style="span", options={"spans_key": "custom"})
```
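
`displacy.serve` starts a web server and blocks until it is stopped. As a minimal sketch, assuming spaCy v3.3 or later is installed, the same overlapping spans can be rendered to an HTML string with `displacy.render` instead, which is handy for scripts. The variable name `html` is only illustrative.

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

nlp = spacy.blank("en")
doc = nlp("Welcome to the Bank of China.")
# Same overlapping spans as above, stored under the "custom" key
doc.spans["custom"] = [Span(doc, 3, 6, "ORG"), Span(doc, 5, 6, "GPE")]

# jupyter=False makes render() return the markup as a string,
# even when the code runs inside a notebook
html = displacy.render(
    doc,
    style="span",
    options={"spans_key": "custom"},
    jupyter=False,
)
```

The returned string can then be written to a file or embedded in a larger page.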