Add note on merging speed in v2.1 (see #3300) [ci skip]

2025-10-24 04:31:17 +03:00 · 2019-02-21 12:34:18 +01:00 · 0fc908d7a5
commit 0fc908d7a5
parent 236aa94ded
1 changed file with 16 additions and 0 deletions
--- a/website/docs/usage/v2-1.md
+++ b/website/docs/usage/v2-1.md
@@ -215,6 +215,22 @@ if all of your models are up to date, you can run the
  means that the `Matcher` in v2.1.x may produce different results compared to
  the `Matcher` in v2.0.x.
 - The deprecated [`Doc.merge`](/api/doc#merge) and
  [`Span.merge`](/api/span#merge) methods still work, but you may notice that
  they now run slower when merging many objects in a row. That's because the
  merging engine was rewritten to be more reliable and to support more efficient
  merging **in bulk**. To take advantage of this, you should rewrite your logic
  to use the [`Doc.retokenize`](/api/doc#retokenize) context manager and perform
  as many merges as possible together in the `with` block.
  ```diff
  - doc[1:5].merge()
  - doc[6:8].merge()
  + with doc.retokenize() as retokenizer:
  +     retokenizer.merge(doc[1:5])
  +     retokenizer.merge(doc[6:8])
  ```
 - For better compatibility with the Universal Dependencies data, the lemmatizer
  now preserves capitalization, e.g. for proper nouns. See
  [this issue](https://github.com/explosion/spaCy/issues/3256) for details.
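
The bulk-merging note added in this hunk can be tried end to end with a minimal sketch like the one below. It assumes spaCy v2.1 or later is installed and uses a blank English pipeline so no trained model is needed. The example sentence and the `spans` list are illustrative assumptions, not part of the commit.

```python
import spacy

# Blank pipeline: tokenizer only, so no model download is required.
nlp = spacy.blank("en")
doc = nlp("I live in New York City and work at Explosion AI now")

# Collect all non-overlapping spans first (illustrative choices),
# then merge them together in a single retokenize block.
spans = [doc[3:6], doc[9:11]]  # "New York City", "Explosion AI"
with doc.retokenize() as retokenizer:
    for span in spans:
        retokenizer.merge(span)

# The merges are applied when the `with` block exits.
tokens = [token.text for token in doc]
print(tokens)
```

Creating the spans before entering the `with` block keeps their indices valid, since the `Doc` is only modified once the context manager exits.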