diff --git a/.github/contributors/Nuccy90.md b/.github/contributors/Nuccy90.md new file mode 100644 index 000000000..2d1adb825 --- /dev/null +++ b/.github/contributors/Nuccy90.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI GmbH](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. Except as set out above, you keep all right, title, and interest in your +contribution. 
The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [ ] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. + +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Elena Fano | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 2020-09-21 | +| GitHub username | Nuccy90 | +| Website (optional) | | diff --git a/.github/contributors/rahul1990gupta.md b/.github/contributors/rahul1990gupta.md new file mode 100644 index 000000000..eab41b3b1 --- /dev/null +++ b/.github/contributors/rahul1990gupta.md @@ -0,0 +1,106 @@ +# spaCy contributor agreement + +This spaCy Contributor Agreement (**"SCA"**) is based on the +[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf). +The SCA applies to any contribution that you make to any product or project +managed by us (the **"project"**), and sets out the intellectual property rights +you grant to us in the contributed materials. The term **"us"** shall mean +[ExplosionAI GmbH](https://explosion.ai/legal). The term +**"you"** shall mean the person or entity identified below. + +If you agree to be bound by these terms, fill in the information requested +below and include the filled-in version with your first pull request, under the +folder [`.github/contributors/`](/.github/contributors/). The name of the file +should be your GitHub username, with the extension `.md`. For example, the user +example_user would create the file `.github/contributors/example_user.md`. + +Read this agreement carefully before signing. These terms and conditions +constitute a binding legal agreement. + +## Contributor Agreement + +1. The term "contribution" or "contributed materials" means any source code, +object code, patch, tool, sample, graphic, specification, manual, +documentation, or any other material posted or submitted by you to the project. + +2. 
With respect to any worldwide copyrights, or copyright applications and +registrations, in your contribution: + + * you hereby assign to us joint ownership, and to the extent that such + assignment is or becomes invalid, ineffective or unenforceable, you hereby + grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge, + royalty-free, unrestricted license to exercise all rights under those + copyrights. This includes, at our option, the right to sublicense these same + rights to third parties through multiple levels of sublicensees or other + licensing arrangements; + + * you agree that each of us can do all things in relation to your + contribution as if each of us were the sole owners, and if one of us makes + a derivative work of your contribution, the one who makes the derivative + work (or has it made will be the sole owner of that derivative work; + + * you agree that you will not assert any moral rights in your contribution + against us, our licensees or transferees; + + * you agree that we may register a copyright in your contribution and + exercise all ownership rights associated with it; and + + * you agree that neither of us has any duty to consult with, obtain the + consent of, pay or render an accounting to the other for any use or + distribution of your contribution. + +3. With respect to any patents you own, or that you can license without payment +to any third party, you hereby grant to us a perpetual, irrevocable, +non-exclusive, worldwide, no-charge, royalty-free license to: + + * make, have made, use, sell, offer to sell, import, and otherwise transfer + your contribution in whole or in part, alone or in combination with or + included in any product, work or materials arising out of the project to + which your contribution was submitted, and + + * at our option, to sublicense these same rights to third parties through + multiple levels of sublicensees or other licensing arrangements. + +4. Except as set out above, you keep all right, title, and interest in your +contribution. The rights that you grant to us under these terms are effective +on the date you first submitted a contribution to us, even if your submission +took place before the date you sign these terms. + +5. You covenant, represent, warrant and agree that: + + * Each contribution that you submit is and shall be an original work of + authorship and you can legally grant the rights set out in this SCA; + + * to the best of your knowledge, each contribution will not violate any + third party's copyrights, trademarks, patents, or other intellectual + property rights; and + + * each contribution shall be in compliance with U.S. export control laws and + other applicable export and import laws. You agree to notify us if you + become aware of any circumstance which would make any of the foregoing + representations inaccurate in any respect. We may publicly disclose your + participation in the project, including the fact that you have signed the SCA. + +6. This SCA is governed by the laws of the State of California and applicable +U.S. Federal law. Any choice of law rules will not apply. + +7. Please place an “x” on one of the applicable statement below. Please do NOT +mark both statements: + + * [x] I am signing on behalf of myself as an individual and no other person + or entity, including my employer, has or will have rights with respect to my + contributions. + + * [ ] I am signing on behalf of my employer or a legal entity and I have the + actual authority to contractually bind that entity. 
+ +## Contributor Details + +| Field | Entry | +|------------------------------- | -------------------- | +| Name | Rahul Gupta | +| Company name (if applicable) | | +| Title or role (if applicable) | | +| Date | 28 July 2020 | +| GitHub username | rahul1990gupta | +| Website (optional) | | diff --git a/spacy/about.py b/spacy/about.py index 9c5dd0b4f..bf1d53a7b 100644 --- a/spacy/about.py +++ b/spacy/about.py @@ -1,6 +1,6 @@ # fmt: off __title__ = "spacy-nightly" -__version__ = "3.0.0a41" +__version__ = "3.0.0rc1" __download_url__ = "https://github.com/explosion/spacy-models/releases/download" __compatibility__ = "https://raw.githubusercontent.com/explosion/spacy-models/master/compatibility.json" __projects__ = "https://github.com/explosion/projects" diff --git a/spacy/lang/hi/lex_attrs.py b/spacy/lang/hi/lex_attrs.py index 20a8c2975..a18c2e513 100644 --- a/spacy/lang/hi/lex_attrs.py +++ b/spacy/lang/hi/lex_attrs.py @@ -10,23 +10,26 @@ _stem_suffixes = [ ["ाएगी", "ाएगा", "ाओगी", "ाओगे", "एंगी", "ेंगी", "एंगे", "ेंगे", "ूंगी", "ूंगा", "ातीं", "नाओं", "नाएं", "ताओं", "ताएं", "ियाँ", "ियों", "ियां"], ["ाएंगी", "ाएंगे", "ाऊंगी", "ाऊंगा", "ाइयाँ", "ाइयों", "ाइयां"] ] -# fmt: on -# reference 1:https://en.wikipedia.org/wiki/Indian_numbering_system +# reference 1: https://en.wikipedia.org/wiki/Indian_numbering_system # reference 2: https://blogs.transparent.com/hindi/hindi-numbers-1-100/ +# reference 3: https://www.mindurhindi.com/basic-words-and-phrases-in-hindi/ -_num_words = [ +_one_to_ten = [ "शून्य", "एक", "दो", "तीन", "चार", - "पांच", + "पांच", "पाँच", "छह", "सात", "आठ", "नौ", "दस", +] + +_eleven_to_beyond = [ "ग्यारह", "बारह", "तेरह", @@ -37,13 +40,85 @@ _num_words = [ "अठारह", "उन्नीस", "बीस", + "इकीस", "इक्कीस", + "बाईस", + "तेइस", + "चौबीस", + "पच्चीस", + "छब्बीस", + "सताइस", "सत्ताइस", + "अट्ठाइस", + "उनतीस", "तीस", + "इकतीस", "इकत्तीस", + "बतीस", "बत्तीस", + "तैंतीस", + "चौंतीस", + "पैंतीस", + "छतीस", "छत्तीस", + "सैंतीस", + "अड़तीस", + "उनतालीस", "उनत्तीस", "चालीस", + "इकतालीस", + "बयालीस", + "तैतालीस", + "चवालीस", + "पैंतालीस", + "छयालिस", + "सैंतालीस", + "अड़तालीस", + "उनचास", "पचास", + "इक्यावन", + "बावन", + "तिरपन", "तिरेपन", + "चौवन", "चउवन", + "पचपन", + "छप्पन", + "सतावन", "सत्तावन", + "अठावन", + "उनसठ", "साठ", + "इकसठ", + "बासठ", + "तिरसठ", "तिरेसठ", + "चौंसठ", + "पैंसठ", + "छियासठ", + "सड़सठ", + "अड़सठ", + "उनहत्तर", "सत्तर", + "इकहत्तर" + "बहत्तर", + "तिहत्तर", + "चौहत्तर", + "पचहत्तर", + "छिहत्तर", + "सतहत्तर", + "अठहत्तर", + "उन्नासी", "उन्यासी" "अस्सी", + "इक्यासी", + "बयासी", + "तिरासी", + "चौरासी", + "पचासी", + "छियासी", + "सतासी", + "अट्ठासी", + "नवासी", "नब्बे", + "इक्यानवे", + "बानवे", + "तिरानवे", + "चौरानवे", + "पचानवे", + "छियानवे", + "सतानवे", + "अट्ठानवे", + "निन्यानवे", "सौ", "हज़ार", "लाख", @@ -52,6 +127,23 @@ _num_words = [ "खरब", ] +_num_words = _one_to_ten + _eleven_to_beyond + +_ordinal_words_one_to_ten = [ + "प्रथम", "पहला", + "द्वितीय", "दूसरा", + "तृतीय", "तीसरा", + "चौथा", + "पांचवाँ", + "छठा", + "सातवाँ", + "आठवाँ", + "नौवाँ", + "दसवाँ", +] +_ordinal_suffix = "वाँ" +# fmt: on + def norm(string): # normalise base exceptions, e.g. 
punctuation or currency symbols @@ -64,7 +156,7 @@ def norm(string): for suffix_group in reversed(_stem_suffixes): length = len(suffix_group[0]) if len(string) <= length: - break + continue for suffix in suffix_group: if string.endswith(suffix): return string[:-length] @@ -74,7 +166,7 @@ def norm(string): def like_num(text): if text.startswith(("+", "-", "±", "~")): text = text[1:] - text = text.replace(", ", "").replace(".", "") + text = text.replace(",", "").replace(".", "") if text.isdigit(): return True if text.count("/") == 1: @@ -83,6 +175,14 @@ def like_num(text): return True if text.lower() in _num_words: return True + + # check ordinal numbers + # reference: http://www.englishkitab.com/Vocabulary/Numbers.html + if text in _ordinal_words_one_to_ten: + return True + if text.endswith(_ordinal_suffix): + if text[: -len(_ordinal_suffix)] in _eleven_to_beyond: + return True return False diff --git a/spacy/lang/ta/examples.py b/spacy/lang/ta/examples.py index c3c47e66e..e68dc6237 100644 --- a/spacy/lang/ta/examples.py +++ b/spacy/lang/ta/examples.py @@ -19,4 +19,6 @@ sentences = [ "தன்னாட்சி கார்கள் காப்பீட்டு பொறுப்பை உற்பத்தியாளரிடம் மாற்றுகின்றன", "நடைபாதை விநியோக ரோபோக்களை தடை செய்வதை சான் பிரான்சிஸ்கோ கருதுகிறது", "லண்டன் ஐக்கிய இராச்சியத்தில் ஒரு பெரிய நகரம்.", + "என்ன வேலை செய்கிறீர்கள்?", + "எந்த கல்லூரியில் படிக்கிறாய்?", ] diff --git a/spacy/lang/tr/lex_attrs.py b/spacy/lang/tr/lex_attrs.py index d9e12c4aa..f7416837d 100644 --- a/spacy/lang/tr/lex_attrs.py +++ b/spacy/lang/tr/lex_attrs.py @@ -73,20 +73,16 @@ def like_num(text): num, denom = text.split("/") if num.isdigit() and denom.isdigit(): return True - text_lower = text.lower() - # Check cardinal number if text_lower in _num_words: return True - # Check ordinal number if text_lower in _ordinal_words: return True if text_lower.endswith(_ordinal_endings): if text_lower[:-3].isdigit() or text_lower[:-4].isdigit(): return True - return False diff --git a/spacy/lang/tr/syntax_iterators.py b/spacy/lang/tr/syntax_iterators.py index d9b342949..3fd726fb5 100644 --- a/spacy/lang/tr/syntax_iterators.py +++ b/spacy/lang/tr/syntax_iterators.py @@ -1,6 +1,3 @@ -# coding: utf8 -from __future__ import unicode_literals - from ...symbols import NOUN, PROPN, PRON from ...errors import Errors diff --git a/spacy/tests/conftest.py b/spacy/tests/conftest.py index 3b0de899b..3733d345d 100644 --- a/spacy/tests/conftest.py +++ b/spacy/tests/conftest.py @@ -125,6 +125,11 @@ def he_tokenizer(): return get_lang_class("he")().tokenizer +@pytest.fixture(scope="session") +def hi_tokenizer(): + return get_lang_class("hi")().tokenizer + + @pytest.fixture(scope="session") def hr_tokenizer(): return get_lang_class("hr")().tokenizer @@ -240,11 +245,6 @@ def tr_tokenizer(): return get_lang_class("tr")().tokenizer -@pytest.fixture(scope="session") -def tr_vocab(): - return get_lang_class("tr").Defaults.create_vocab() - - @pytest.fixture(scope="session") def tt_tokenizer(): return get_lang_class("tt")().tokenizer @@ -297,11 +297,7 @@ def zh_tokenizer_pkuseg(): "segmenter": "pkuseg", } }, - "initialize": { - "tokenizer": { - "pkuseg_model": "web", - } - }, + "initialize": {"tokenizer": {"pkuseg_model": "web"}}, } nlp = get_lang_class("zh").from_config(config) nlp.initialize() diff --git a/spacy/tests/lang/hi/__init__.py b/spacy/tests/lang/hi/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/spacy/tests/lang/hi/test_lex_attrs.py b/spacy/tests/lang/hi/test_lex_attrs.py new file mode 100644 index 000000000..80a7cc1c4 --- /dev/null +++ 
b/spacy/tests/lang/hi/test_lex_attrs.py @@ -0,0 +1,43 @@ +import pytest +from spacy.lang.hi.lex_attrs import norm, like_num + + +def test_hi_tokenizer_handles_long_text(hi_tokenizer): + text = """ +ये कहानी 1900 के दशक की है। कौशल्या (स्मिता जयकर) को पता चलता है कि उसका +छोटा बेटा, देवदास (शाहरुख खान) वापस घर आ रहा है। देवदास 10 साल पहले कानून की +पढ़ाई करने के लिए इंग्लैंड गया था। उसके लौटने की खुशी में ये बात कौशल्या अपनी पड़ोस +में रहने वाली सुमित्रा (किरण खेर) को भी बता देती है। इस खबर से वो भी खुश हो जाती है। +""" + tokens = hi_tokenizer(text) + assert len(tokens) == 86 + + +@pytest.mark.parametrize( + "word,word_norm", + [ + ("चलता", "चल"), + ("पढ़ाई", "पढ़"), + ("देती", "दे"), + ("जाती", "ज"), + ("मुस्कुराकर", "मुस्कुर"), + ], +) +def test_hi_norm(word, word_norm): + assert norm(word) == word_norm + + +@pytest.mark.parametrize( + "word", + ["१९८७", "1987", "१२,२६७", "उन्नीस", "पाँच", "नवासी", "५/१०"], +) +def test_hi_like_num(word): + assert like_num(word) + + +@pytest.mark.parametrize( + "word", + ["पहला", "तृतीय", "निन्यानवेवाँ", "उन्नीस", "तिहत्तरवाँ", "छत्तीसवाँ"], +) +def test_hi_like_num_ordinal_words(word): + assert like_num(word) diff --git a/spacy/tests/parser/test_ner.py b/spacy/tests/parser/test_ner.py index b657ae2e8..b4c22b48d 100644 --- a/spacy/tests/parser/test_ner.py +++ b/spacy/tests/parser/test_ner.py @@ -1,4 +1,7 @@ import pytest +from numpy.testing import assert_equal +from spacy.attrs import ENT_IOB + from spacy import util from spacy.lang.en import English from spacy.language import Language @@ -332,6 +335,19 @@ def test_overfitting_IO(): assert ents2[0].text == "London" assert ents2[0].label_ == "LOC" + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = [ + "Just a sentence.", + "Then one more sentence about London.", + "Here is another one.", + "I like London.", + ] + batch_deps_1 = [doc.to_array([ENT_IOB]) for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.to_array([ENT_IOB]) for doc in nlp.pipe(texts)] + no_batch_deps = [doc.to_array([ENT_IOB]) for doc in [nlp(text) for text in texts]] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) + def test_ner_warns_no_lookups(caplog): nlp = English() diff --git a/spacy/tests/parser/test_parse.py b/spacy/tests/parser/test_parse.py index ffb6f23f1..a914eb17a 100644 --- a/spacy/tests/parser/test_parse.py +++ b/spacy/tests/parser/test_parse.py @@ -1,4 +1,7 @@ import pytest +from numpy.testing import assert_equal +from spacy.attrs import DEP + from spacy.lang.en import English from spacy.training import Example from spacy.tokens import Doc @@ -210,3 +213,16 @@ def test_overfitting_IO(): assert doc2[0].dep_ == "nsubj" assert doc2[2].dep_ == "dobj" assert doc2[3].dep_ == "punct" + + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = [ + "Just a sentence.", + "Then one more sentence about London.", + "Here is another one.", + "I like London.", + ] + batch_deps_1 = [doc.to_array([DEP]) for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.to_array([DEP]) for doc in nlp.pipe(texts)] + no_batch_deps = [doc.to_array([DEP]) for doc in [nlp(text) for text in texts]] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) diff --git a/spacy/tests/pipeline/test_entity_linker.py b/spacy/tests/pipeline/test_entity_linker.py index f2e6defcb..8ba2d0d3e 100644 --- a/spacy/tests/pipeline/test_entity_linker.py +++ b/spacy/tests/pipeline/test_entity_linker.py @@ -1,5 
+1,7 @@ from typing import Callable, Iterable import pytest +from numpy.testing import assert_equal +from spacy.attrs import ENT_KB_ID from spacy.kb import KnowledgeBase, get_candidates, Candidate from spacy.vocab import Vocab @@ -496,6 +498,19 @@ def test_overfitting_IO(): predictions.append(ent.kb_id_) assert predictions == GOLD_entities + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = [ + "Russ Cochran captured his first major title with his son as caddie.", + "Russ Cochran his reprints include EC Comics.", + "Russ Cochran has been publishing comic art.", + "Russ Cochran was a member of University of Kentucky's golf team.", + ] + batch_deps_1 = [doc.to_array([ENT_KB_ID]) for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.to_array([ENT_KB_ID]) for doc in nlp.pipe(texts)] + no_batch_deps = [doc.to_array([ENT_KB_ID]) for doc in [nlp(text) for text in texts]] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) + def test_kb_serialization(): # Test that the KB can be used in a pipeline with a different vocab diff --git a/spacy/tests/pipeline/test_models.py b/spacy/tests/pipeline/test_models.py new file mode 100644 index 000000000..d04ac9cd4 --- /dev/null +++ b/spacy/tests/pipeline/test_models.py @@ -0,0 +1,107 @@ +from typing import List + +import numpy +import pytest +from numpy.testing import assert_almost_equal +from spacy.vocab import Vocab +from thinc.api import NumpyOps, Model, data_validation +from thinc.types import Array2d, Ragged + +from spacy.lang.en import English +from spacy.ml import FeatureExtractor, StaticVectors +from spacy.ml._character_embed import CharacterEmbed +from spacy.tokens import Doc + + +OPS = NumpyOps() + +texts = ["These are 4 words", "Here just three"] +l0 = [[1, 2], [3, 4], [5, 6], [7, 8]] +l1 = [[9, 8], [7, 6], [5, 4]] +list_floats = [OPS.xp.asarray(l0, dtype="f"), OPS.xp.asarray(l1, dtype="f")] +list_ints = [OPS.xp.asarray(l0, dtype="i"), OPS.xp.asarray(l1, dtype="i")] +array = OPS.xp.asarray(l1, dtype="f") +ragged = Ragged(array, OPS.xp.asarray([2, 1], dtype="i")) + + +def get_docs(): + vocab = Vocab() + for t in texts: + for word in t.split(): + hash_id = vocab.strings.add(word) + vector = numpy.random.uniform(-1, 1, (7,)) + vocab.set_vector(hash_id, vector) + docs = [English(vocab)(t) for t in texts] + return docs + + +# Test components with a model of type Model[List[Doc], List[Floats2d]] +@pytest.mark.parametrize("name", ["tagger", "tok2vec", "morphologizer", "senter"]) +def test_components_batching_list(name): + nlp = English() + proc = nlp.create_pipe(name) + util_batch_unbatch_docs_list(proc.model, get_docs(), list_floats) + + +# Test components with a model of type Model[List[Doc], Floats2d] +@pytest.mark.parametrize("name", ["textcat"]) +def test_components_batching_array(name): + nlp = English() + proc = nlp.create_pipe(name) + util_batch_unbatch_docs_array(proc.model, get_docs(), array) + + +LAYERS = [ + (CharacterEmbed(nM=5, nC=3), get_docs(), list_floats), + (FeatureExtractor([100, 200]), get_docs(), list_ints), + (StaticVectors(), get_docs(), ragged), +] + + +@pytest.mark.parametrize("model,in_data,out_data", LAYERS) +def test_layers_batching_all(model, in_data, out_data): + # In = List[Doc] + if isinstance(in_data, list) and isinstance(in_data[0], Doc): + if isinstance(out_data, OPS.xp.ndarray) and out_data.ndim == 2: + util_batch_unbatch_docs_array(model, in_data, out_data) + elif ( + isinstance(out_data, list) + and isinstance(out_data[0], 
OPS.xp.ndarray) + and out_data[0].ndim == 2 + ): + util_batch_unbatch_docs_list(model, in_data, out_data) + elif isinstance(out_data, Ragged): + util_batch_unbatch_docs_ragged(model, in_data, out_data) + + +def util_batch_unbatch_docs_list( + model: Model[List[Doc], List[Array2d]], in_data: List[Doc], out_data: List[Array2d] +): + with data_validation(True): + model.initialize(in_data, out_data) + Y_batched = model.predict(in_data) + Y_not_batched = [model.predict([u])[0] for u in in_data] + for i in range(len(Y_batched)): + assert_almost_equal(Y_batched[i], Y_not_batched[i], decimal=4) + + +def util_batch_unbatch_docs_array( + model: Model[List[Doc], Array2d], in_data: List[Doc], out_data: Array2d +): + with data_validation(True): + model.initialize(in_data, out_data) + Y_batched = model.predict(in_data).tolist() + Y_not_batched = [model.predict([u])[0] for u in in_data] + assert_almost_equal(Y_batched, Y_not_batched, decimal=4) + + +def util_batch_unbatch_docs_ragged( + model: Model[List[Doc], Ragged], in_data: List[Doc], out_data: Ragged +): + with data_validation(True): + model.initialize(in_data, out_data) + Y_batched = model.predict(in_data) + Y_not_batched = [] + for u in in_data: + Y_not_batched.extend(model.predict([u]).data.tolist()) + assert_almost_equal(Y_batched.data, Y_not_batched, decimal=4) diff --git a/spacy/tests/pipeline/test_morphologizer.py b/spacy/tests/pipeline/test_morphologizer.py index fd7aa05be..85d1d6c8b 100644 --- a/spacy/tests/pipeline/test_morphologizer.py +++ b/spacy/tests/pipeline/test_morphologizer.py @@ -1,4 +1,5 @@ import pytest +from numpy.testing import assert_equal from spacy import util from spacy.training import Example @@ -6,6 +7,7 @@ from spacy.lang.en import English from spacy.language import Language from spacy.tests.util import make_tempdir from spacy.morphology import Morphology +from spacy.attrs import MORPH def test_label_types(): @@ -101,3 +103,16 @@ def test_overfitting_IO(): doc2 = nlp2(test_text) assert [str(t.morph) for t in doc2] == gold_morphs assert [t.pos_ for t in doc2] == gold_pos_tags + + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = [ + "Just a sentence.", + "Then one more sentence about London.", + "Here is another one.", + "I like London.", + ] + batch_deps_1 = [doc.to_array([MORPH]) for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.to_array([MORPH]) for doc in nlp.pipe(texts)] + no_batch_deps = [doc.to_array([MORPH]) for doc in [nlp(text) for text in texts]] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) diff --git a/spacy/tests/pipeline/test_senter.py b/spacy/tests/pipeline/test_senter.py index c9722e5de..7a256f79b 100644 --- a/spacy/tests/pipeline/test_senter.py +++ b/spacy/tests/pipeline/test_senter.py @@ -1,4 +1,6 @@ import pytest +from numpy.testing import assert_equal +from spacy.attrs import SENT_START from spacy import util from spacy.training import Example @@ -80,3 +82,18 @@ def test_overfitting_IO(): nlp2 = util.load_model_from_path(tmp_dir) doc2 = nlp2(test_text) assert [int(t.is_sent_start) for t in doc2] == gold_sent_starts + + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = [ + "Just a sentence.", + "Then one more sentence about London.", + "Here is another one.", + "I like London.", + ] + batch_deps_1 = [doc.to_array([SENT_START]) for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.to_array([SENT_START]) for doc in nlp.pipe(texts)] + no_batch_deps = [ + 
doc.to_array([SENT_START]) for doc in [nlp(text) for text in texts] + ] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) diff --git a/spacy/tests/pipeline/test_tagger.py b/spacy/tests/pipeline/test_tagger.py index b9db76cdf..885bdbce1 100644 --- a/spacy/tests/pipeline/test_tagger.py +++ b/spacy/tests/pipeline/test_tagger.py @@ -1,4 +1,7 @@ import pytest +from numpy.testing import assert_equal +from spacy.attrs import TAG + from spacy import util from spacy.training import Example from spacy.lang.en import English @@ -117,6 +120,19 @@ def test_overfitting_IO(): assert doc2[2].tag_ is "J" assert doc2[3].tag_ is "N" + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = [ + "Just a sentence.", + "I like green eggs.", + "Here is another one.", + "I eat ham.", + ] + batch_deps_1 = [doc.to_array([TAG]) for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.to_array([TAG]) for doc in nlp.pipe(texts)] + no_batch_deps = [doc.to_array([TAG]) for doc in [nlp(text) for text in texts]] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) + def test_tagger_requires_labels(): nlp = English() diff --git a/spacy/tests/pipeline/test_textcat.py b/spacy/tests/pipeline/test_textcat.py index dd2f1070b..91348b1b3 100644 --- a/spacy/tests/pipeline/test_textcat.py +++ b/spacy/tests/pipeline/test_textcat.py @@ -1,6 +1,7 @@ import pytest import random import numpy.random +from numpy.testing import assert_equal from thinc.api import fix_random_seed from spacy import util from spacy.lang.en import English @@ -174,6 +175,14 @@ def test_overfitting_IO(): assert scores["cats_score"] == 1.0 assert "cats_score_desc" in scores + # Make sure that running pipe twice, or comparing to call, always amounts to the same predictions + texts = ["Just a sentence.", "I like green eggs.", "I am happy.", "I eat ham."] + batch_deps_1 = [doc.cats for doc in nlp.pipe(texts)] + batch_deps_2 = [doc.cats for doc in nlp.pipe(texts)] + no_batch_deps = [doc.cats for doc in [nlp(text) for text in texts]] + assert_equal(batch_deps_1, batch_deps_2) + assert_equal(batch_deps_1, no_batch_deps) + # fmt: off @pytest.mark.parametrize( diff --git a/spacy/tests/regression/test_issue5501-6000.py b/spacy/tests/regression/test_issue5501-6000.py new file mode 100644 index 000000000..f0b46cb83 --- /dev/null +++ b/spacy/tests/regression/test_issue5501-6000.py @@ -0,0 +1,76 @@ +from thinc.api import fix_random_seed +from spacy.lang.en import English +from spacy.tokens import Span +from spacy import displacy +from spacy.pipeline import merge_entities + + +def test_issue5551(): + """Test that after fixing the random seed, the results of the pipeline are truly identical""" + component = "textcat" + pipe_cfg = { + "model": { + "@architectures": "spacy.TextCatBOW.v1", + "exclusive_classes": True, + "ngram_size": 2, + "no_output_layer": False, + } + } + results = [] + for i in range(3): + fix_random_seed(0) + nlp = English() + example = ( + "Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.", + {"cats": {"Labe1": 1.0, "Label2": 0.0, "Label3": 0.0}}, + ) + pipe = nlp.add_pipe(component, config=pipe_cfg, last=True) + for label in set(example[1]["cats"]): + pipe.add_label(label) + nlp.initialize() + # Store the result of each iteration + result = pipe.model.predict([nlp.make_doc(example[0])]) + results.append(list(result[0])) + # All results should be the same because of the fixed seed + assert len(results) == 3 + 
assert results[0] == results[1] + assert results[0] == results[2] + + +def test_issue5838(): + # Displacy's EntityRenderer break line + # not working after last entity + sample_text = "First line\nSecond line, with ent\nThird line\nFourth line\n" + nlp = English() + doc = nlp(sample_text) + doc.ents = [Span(doc, 7, 8, label="test")] + html = displacy.render(doc, style="ent") + found = html.count("</br>
") + assert found == 4 + + +def test_issue5918(): + # Test edge case when merging entities. + nlp = English() + ruler = nlp.add_pipe("entity_ruler") + patterns = [ + {"label": "ORG", "pattern": "Digicon Inc"}, + {"label": "ORG", "pattern": "Rotan Mosle Inc's"}, + {"label": "ORG", "pattern": "Rotan Mosle Technology Partners Ltd"}, + ] + ruler.add_patterns(patterns) + + text = """ + Digicon Inc said it has completed the previously-announced disposition + of its computer systems division to an investment group led by + Rotan Mosle Inc's Rotan Mosle Technology Partners Ltd affiliate. + """ + doc = nlp(text) + assert len(doc.ents) == 3 + # make it so that the third span's head is within the entity (ent_iob=I) + # bug #5918 would wrongly transfer that I to the full entity, resulting in 2 instead of 3 final ents. + # TODO: test for logging here + # with pytest.warns(UserWarning): + # doc[29].head = doc[33] + doc = merge_entities(doc) + assert len(doc.ents) == 3 diff --git a/spacy/tests/regression/test_issue5551.py b/spacy/tests/regression/test_issue5551.py deleted file mode 100644 index 655764362..000000000 --- a/spacy/tests/regression/test_issue5551.py +++ /dev/null @@ -1,37 +0,0 @@ -from spacy.lang.en import English -from spacy.util import fix_random_seed - - -def test_issue5551(): - """Test that after fixing the random seed, the results of the pipeline are truly identical""" - component = "textcat" - pipe_cfg = { - "model": { - "@architectures": "spacy.TextCatBOW.v1", - "exclusive_classes": True, - "ngram_size": 2, - "no_output_layer": False, - } - } - - results = [] - for i in range(3): - fix_random_seed(0) - nlp = English() - example = ( - "Once hot, form ping-pong-ball-sized balls of the mixture, each weighing roughly 25 g.", - {"cats": {"Labe1": 1.0, "Label2": 0.0, "Label3": 0.0}}, - ) - pipe = nlp.add_pipe(component, config=pipe_cfg, last=True) - for label in set(example[1]["cats"]): - pipe.add_label(label) - nlp.initialize() - - # Store the result of each iteration - result = pipe.model.predict([nlp.make_doc(example[0])]) - results.append(list(result[0])) - - # All results should be the same because of the fixed seed - assert len(results) == 3 - assert results[0] == results[1] - assert results[0] == results[2] diff --git a/spacy/tests/regression/test_issue5838.py b/spacy/tests/regression/test_issue5838.py deleted file mode 100644 index 4e4d98beb..000000000 --- a/spacy/tests/regression/test_issue5838.py +++ /dev/null @@ -1,23 +0,0 @@ -from spacy.lang.en import English -from spacy.tokens import Span -from spacy import displacy - - -SAMPLE_TEXT = """First line -Second line, with ent -Third line -Fourth line -""" - - -def test_issue5838(): - # Displacy's EntityRenderer break line - # not working after last entity - - nlp = English() - doc = nlp(SAMPLE_TEXT) - doc.ents = [Span(doc, 7, 8, label="test")] - - html = displacy.render(doc, style="ent") - found = html.count("
") - assert found == 4 diff --git a/spacy/tests/regression/test_issue5918.py b/spacy/tests/regression/test_issue5918.py deleted file mode 100644 index d25323ef6..000000000 --- a/spacy/tests/regression/test_issue5918.py +++ /dev/null @@ -1,29 +0,0 @@ -from spacy.lang.en import English -from spacy.pipeline import merge_entities - - -def test_issue5918(): - # Test edge case when merging entities. - nlp = English() - ruler = nlp.add_pipe("entity_ruler") - patterns = [ - {"label": "ORG", "pattern": "Digicon Inc"}, - {"label": "ORG", "pattern": "Rotan Mosle Inc's"}, - {"label": "ORG", "pattern": "Rotan Mosle Technology Partners Ltd"}, - ] - ruler.add_patterns(patterns) - - text = """ - Digicon Inc said it has completed the previously-announced disposition - of its computer systems division to an investment group led by - Rotan Mosle Inc's Rotan Mosle Technology Partners Ltd affiliate. - """ - doc = nlp(text) - assert len(doc.ents) == 3 - # make it so that the third span's head is within the entity (ent_iob=I) - # bug #5918 would wrongly transfer that I to the full entity, resulting in 2 instead of 3 final ents. - # TODO: test for logging here - # with pytest.warns(UserWarning): - # doc[29].head = doc[33] - doc = merge_entities(doc) - assert len(doc.ents) == 3 diff --git a/spacy/tests/regression/test_issue5230.py b/spacy/tests/serialize/test_resource_warning.py similarity index 100% rename from spacy/tests/regression/test_issue5230.py rename to spacy/tests/serialize/test_resource_warning.py diff --git a/spacy/training/gold_io.pyx b/spacy/training/gold_io.pyx index 8fb6b8565..327748d01 100644 --- a/spacy/training/gold_io.pyx +++ b/spacy/training/gold_io.pyx @@ -20,7 +20,8 @@ def docs_to_json(docs, doc_id=0, ner_missing_tag="O"): docs = [docs] json_doc = {"id": doc_id, "paragraphs": []} for i, doc in enumerate(docs): - json_para = {'raw': doc.text, "sentences": [], "cats": [], "entities": [], "links": []} + raw = None if doc.has_unknown_spaces else doc.text + json_para = {'raw': raw, "sentences": [], "cats": [], "entities": [], "links": []} for cat, val in doc.cats.items(): json_cat = {"label": cat, "value": val} json_para["cats"].append(json_cat) diff --git a/spacy/training/loop.py b/spacy/training/loop.py index c3fa83b39..eecb3e273 100644 --- a/spacy/training/loop.py +++ b/spacy/training/loop.py @@ -112,10 +112,10 @@ def train( nlp.to_disk(final_model_path) else: nlp.to_disk(final_model_path) - # This will only run if we don't hit an error - stdout.write( - msg.good("Saved pipeline to output directory", final_model_path) + "\n" - ) + # This will only run if we don't hit an error + stdout.write( + msg.good("Saved pipeline to output directory", final_model_path) + "\n" + ) def train_while_improving( diff --git a/website/docs/usage/_benchmarks-models.md b/website/docs/usage/_benchmarks-models.md index becd313f4..1e755e39d 100644 --- a/website/docs/usage/_benchmarks-models.md +++ b/website/docs/usage/_benchmarks-models.md @@ -1,19 +1,18 @@ import { Help } from 'components/typography'; import Link from 'components/link' - -
-| Pipeline | Parser | Tagger | NER | WPS
CPU words per second on CPU, higher is better | WPS
GPU words per second on GPU, higher is better | -| ---------------------------------------------------------- | -----: | -----: | ---: | ------------------------------------------------------------------: | -----------------------------------------------------------------: | -| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) | 95.5 | 98.3 | 89.7 | 1k | 8k | -| [`en_core_web_lg`](/models/en#en_core_web_lg) (spaCy v3) | 92.2 | 97.4 | 85.8 | 7k | | -| `en_core_web_lg` (spaCy v2) | 91.9 | 97.2 | | 10k | | +| Pipeline | Parser | Tagger | NER | +| ---------------------------------------------------------- | -----: | -----: | ---: | +| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) | 95.5 | 98.3 | 89.4 | +| [`en_core_web_lg`](/models/en#en_core_web_lg) (spaCy v3) | 92.2 | 97.4 | 85.4 | +| `en_core_web_lg` (spaCy v2) | 91.9 | 97.2 | 85.5 |
**Full pipeline accuracy and speed** on the -[OntoNotes 5.0](https://catalog.ldc.upenn.edu/LDC2013T19) corpus. +[OntoNotes 5.0](https://catalog.ldc.upenn.edu/LDC2013T19) corpus (reported on +the development set).
@@ -21,14 +20,11 @@ import { Help } from 'components/typography'; import Link from 'components/link'
-| Named Entity Recognition System | OntoNotes | CoNLL '03 | -| ------------------------------------------------------------------------------ | --------: | --------: | -| spaCy RoBERTa (2020) | 89.7 | 91.6 | -| spaCy CNN (2020) | 84.5 | | -| spaCy CNN (2017) | | | -| [Stanza](https://stanfordnlp.github.io/stanza/) (StanfordNLP)1 | 88.8 | 92.1 | -| Flair2 | 89.7 | 93.1 | -| BERT Base3 | - | 92.4 | +| Named Entity Recognition System | OntoNotes | CoNLL '03 | +| -------------------------------- | --------: | --------: | +| spaCy RoBERTa (2020) | 89.7 | 91.6 | +| Stanza (StanfordNLP)1 | 88.8 | 92.1 | +| Flair2 | 89.7 | 93.1 |
@@ -36,9 +32,10 @@ import { Help } from 'components/typography'; import Link from 'components/link' [OntoNotes 5.0](https://catalog.ldc.upenn.edu/LDC2013T19) and [CoNLL-2003](https://www.aclweb.org/anthology/W03-0419.pdf) corpora. See [NLP-progress](http://nlpprogress.com/english/named_entity_recognition.html) for -more results. **1. ** [Qi et al. (2020)](https://arxiv.org/pdf/2003.07082.pdf). -**2. ** [Akbik et al. (2018)](https://www.aclweb.org/anthology/C18-1139/). **3. -** [Devlin et al. (2018)](https://arxiv.org/abs/1810.04805). +more results. Project template: +[`benchmarks/ner_conll03`](%%GITHUB_PROJECTS/benchmarks/ner_conll03). **1. ** +[Qi et al. (2020)](https://arxiv.org/pdf/2003.07082.pdf). **2. ** +[Akbik et al. (2018)](https://www.aclweb.org/anthology/C18-1139/).
diff --git a/website/docs/usage/facts-figures.md b/website/docs/usage/facts-figures.md index 2707f68fa..269ac5e17 100644 --- a/website/docs/usage/facts-figures.md +++ b/website/docs/usage/facts-figures.md @@ -10,6 +10,18 @@ menu: ## Comparison {#comparison hidden="true"} +spaCy is a **free, open-source library** for advanced **Natural Language +Processing** (NLP) in Python. It's designed specifically for **production use** +and helps you build applications that process and "understand" large volumes of +text. It can be used to build information extraction or natural language +understanding systems. + +### Feature overview {#comparison-features} + +import Features from 'widgets/features.js' + + + ### When should I use spaCy? {#comparison-usage} - ✅ **I'm a beginner and just getting started with NLP.** – spaCy makes it easy @@ -65,8 +77,7 @@ import Benchmarks from 'usage/\_benchmarks-models.md' | Dependency Parsing System | UAS | LAS | | ------------------------------------------------------------------------------ | ---: | ---: | -| spaCy RoBERTa (2020)1 | 95.5 | 94.3 | -| spaCy CNN (2020)1 | | | +| spaCy RoBERTa (2020) | 95.5 | 94.3 | | [Mrini et al.](https://khalilmrini.github.io/Label_Attention_Layer.pdf) (2019) | 97.4 | 96.3 | | [Zhou and Zhao](https://www.aclweb.org/anthology/P19-1230/) (2019) | 97.2 | 95.7 | @@ -74,7 +85,7 @@ import Benchmarks from 'usage/\_benchmarks-models.md' **Dependency parsing accuracy** on the Penn Treebank. See [NLP-progress](http://nlpprogress.com/english/dependency_parsing.html) for more -results. **1. ** Project template: +results. Project template: [`benchmarks/parsing_penn_treebank`](%%GITHUB_PROJECTS/benchmarks/parsing_penn_treebank). diff --git a/website/docs/usage/rule-based-matching.md b/website/docs/usage/rule-based-matching.md index d1a8497d7..131bd8c94 100644 --- a/website/docs/usage/rule-based-matching.md +++ b/website/docs/usage/rule-based-matching.md @@ -489,11 +489,11 @@ This allows you to write callbacks that consider the entire set of matched phrases, so that you can resolve overlaps and other conflicts in whatever way you prefer. -| Argument | Description | -| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | -| `matcher` | The matcher instance. ~~Matcher~~ | -| `doc` | The document the matcher was used on. ~~Doc~~ | -| `i` | Index of the current match (`matches[i`]). ~~int~~ | +| Argument | Description | +| --------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | +| `matcher` | The matcher instance. ~~Matcher~~ | +| `doc` | The document the matcher was used on. ~~Doc~~ | +| `i` | Index of the current match (`matches[i`]). ~~int~~ | | `matches` | A list of `(match_id, start, end)` tuples, describing the matches. A match tuple describes a span `doc[start:end`]. ~~List[Tuple[int, int int]]~~ | ### Creating spans from matches {#matcher-spans} @@ -631,8 +631,8 @@ To get a quick overview of the results, you could collect all sentences containing a match and render them with the [displaCy visualizer](/usage/visualizers). In the callback function, you'll have access to the `start` and `end` of each match, as well as the parent `Doc`. This -lets you determine the sentence containing the match, `doc[start:end].sent`, -and calculate the start and end of the matched span within the sentence. 
Using +lets you determine the sentence containing the match, `doc[start:end].sent`, and +calculate the start and end of the matched span within the sentence. Using displaCy in ["manual" mode](/usage/visualizers#manual-usage) lets you pass in a list of dictionaries containing the text and entities to render. diff --git a/website/docs/usage/v3.md b/website/docs/usage/v3.md index 9191a7db2..d9d636bb1 100644 --- a/website/docs/usage/v3.md +++ b/website/docs/usage/v3.md @@ -77,6 +77,26 @@ import Benchmarks from 'usage/\_benchmarks-models.md' +#### New trained transformer-based pipelines {#features-transformers-pipelines} + +> #### Notes on model capabilities +> +> The models are each trained with a **single transformer** shared across the +> pipeline, which requires it to be trained on a single corpus. For +> [English](/models/en) and [Chinese](/models/zh), we used the OntoNotes 5 +> corpus, which has annotations across several tasks. For [French](/models/fr), +> [Spanish](/models/es) and [German](/models/de), we didn't have a suitable +> corpus that had both syntactic and entity annotations, so the transformer +> models for those languages do not include NER. + +| Package | Language | Transformer | Tagger | Parser |  NER | +| ------------------------------------------------ | -------- | --------------------------------------------------------------------------------------------- | -----: | -----: | ---: | +| [`en_core_web_trf`](/models/en#en_core_web_trf) | English | [`roberta-base`](https://huggingface.co/roberta-base) | 97.8 | 95.0 | 89.4 | +| [`de_dep_news_trf`](/models/de#de_dep_news_trf) | German | [`bert-base-german-cased`](https://huggingface.co/bert-base-german-cased) | 99.0 | 95.8 | - | +| [`es_dep_news_trf`](/models/es#es_dep_news_trf) | Spanish | [`bert-base-spanish-wwm-cased`](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) | 98.2 | 94.6 | - | +| [`fr_dep_news_trf`](/models/fr#fr_dep_news_trf) | French | [`camembert-base`](https://huggingface.co/camembert-base) | 95.7 | 94.9 | - | +| [`zh_core_web_trf`](/models/zh#zh_core_news_trf) | Chinese | [`bert-base-chinese`](https://huggingface.co/bert-base-chinese) | 92.5 | 77.2 | 75.6 | + - **Usage:** [Embeddings & Transformers](/usage/embeddings-transformers), @@ -88,11 +108,6 @@ import Benchmarks from 'usage/\_benchmarks-models.md' - **Architectures: ** [TransformerModel](/api/architectures#TransformerModel), [TransformerListener](/api/architectures#TransformerListener), [Tok2VecTransformer](/api/architectures#Tok2VecTransformer) -- **Trained Pipelines:** [`en_core_web_trf`](/models/en#en_core_web_trf), - [`de_dep_news_trf`](/models/de#de_dep_news_trf), - [`es_dep_news_trf`](/models/es#es_dep_news_trf), - [`fr_dep_news_trf`](/models/fr#fr_dep_news_trf), - [`zh_core_web_trf`](/models/zh#zh_core_web_trf) - **Implementation:** [`spacy-transformers`](https://github.com/explosion/spacy-transformers) diff --git a/website/src/widgets/features.js b/website/src/widgets/features.js new file mode 100644 index 000000000..73863d5cc --- /dev/null +++ b/website/src/widgets/features.js @@ -0,0 +1,72 @@ +import React from 'react' +import { graphql, StaticQuery } from 'gatsby' + +import { Ul, Li } from '../components/list' + +export default () => ( + { + const { counts } = site.siteMetadata + return ( +
+                <Ul>
+                    <Li>
+                        ✅ Support for {counts.langs}+ languages
+                    </Li>
+                    <Li>
+                        ✅ {counts.models} trained pipelines for{' '}
+                        {counts.modelLangs} languages
+                    </Li>
+                    <Li>
+                        ✅ Multi-task learning with pretrained transformers like
+                        BERT
+                    </Li>
+                    <Li>
+                        ✅ Pretrained word vectors
+                    </Li>
+                    <Li>✅ State-of-the-art speed</Li>
+                    <Li>
+                        ✅ Production-ready training system
+                    </Li>
+                    <Li>
+                        ✅ Linguistically-motivated tokenization
+                    </Li>
+                    <Li>
+                        ✅ Components for named entity recognition, part-of-speech
+                        tagging, dependency parsing, sentence segmentation,{' '}
+                        text classification, lemmatization, morphological analysis,
+                        entity linking and more
+                    </Li>
+                    <Li>
+                        ✅ Easily extensible with custom components and attributes
+                    </Li>
+                    <Li>
+                        ✅ Support for custom models in PyTorch,{' '}
+                        TensorFlow and other frameworks
+                    </Li>
+                    <Li>
+                        ✅ Built in visualizers for syntax and NER
+                    </Li>
+                    <Li>
+                        ✅ Easy model packaging, deployment and workflow management
+                    </Li>
+                    <Li>✅ Robust, rigorously evaluated accuracy</Li>
+                </Ul>
+ ) + }} + /> +) + +const query = graphql` + query FeaturesQuery { + site { + siteMetadata { + counts { + langs + modelLangs + models + } + } + } + } +` diff --git a/website/src/widgets/landing.js b/website/src/widgets/landing.js index 46be93ab5..2cee9460f 100644 --- a/website/src/widgets/landing.js +++ b/website/src/widgets/landing.js @@ -14,13 +14,13 @@ import { LandingBanner, } from '../components/landing' import { H2 } from '../components/typography' -import { Ul, Li } from '../components/list' import { InlineCode } from '../components/code' import Button from '../components/button' import Link from '../components/link' import QuickstartTraining from './quickstart-training' import Project from './project' +import Features from './features' import courseImage from '../../docs/images/course.jpg' import prodigyImage from '../../docs/images/prodigy_overview.jpg' import projectsImage from '../../docs/images/projects.png' @@ -56,7 +56,7 @@ for entity in doc.ents: } const Landing = ({ data }) => { - const { counts, nightly } = data + const { nightly } = data const codeExample = getCodeExample(nightly) return ( <> @@ -98,51 +98,7 @@ const Landing = ({ data }) => {

                    <H2>Features</H2>

-                    <Ul>
-                        <Li>
-                            ✅ Support for {counts.langs}+ languages
-                        </Li>
-                        <Li>
-                            ✅ {counts.models} trained pipelines for{' '}
-                            {counts.modelLangs} languages
-                        </Li>
-                        <Li>
-                            ✅ Multi-task learning with pretrained transformers{' '}
-                            like BERT
-                        </Li>
-                        <Li>
-                            ✅ Pretrained word vectors
-                        </Li>
-                        <Li>✅ State-of-the-art speed</Li>
-                        <Li>
-                            ✅ Production-ready training system
-                        </Li>
-                        <Li>
-                            ✅ Linguistically-motivated tokenization
-                        </Li>
-                        <Li>
-                            ✅ Components for named entity recognition,
-                            part-of-speech tagging, dependency parsing, sentence segmentation,{' '}
-                            text classification, lemmatization, morphological
-                            analysis, entity linking and more
-                        </Li>
-                        <Li>
-                            ✅ Easily extensible with custom components and
-                            attributes
-                        </Li>
-                        <Li>
-                            ✅ Support for custom models in PyTorch,{' '}
-                            TensorFlow and other frameworks
-                        </Li>
-                        <Li>
-                            ✅ Built in visualizers for syntax and NER
-                        </Li>
-                        <Li>
-                            ✅ Easy model packaging, deployment and workflow
-                            management
-                        </Li>
-                        <Li>✅ Robust, rigorously evaluated accuracy</Li>
-                    </Ul>
+                    <Features />
@@ -333,11 +289,6 @@ const landingQuery = graphql` siteMetadata { nightly repo - counts { - langs - modelLangs - models - } } } }
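
Note on the Hindi changes above: a minimal usage sketch of the new `like_num`/`norm` behaviour in `spacy/lang/hi/lex_attrs.py`. This is illustrative only and not part of the diff; it assumes an install of spaCy that includes these changes and mirrors the cases exercised in `spacy/tests/lang/hi/test_lex_attrs.py`.

```python
from spacy.lang.hi.lex_attrs import like_num, norm

# Digits (including Devanagari digits) and cardinal words are number-like.
assert like_num("1987") and like_num("१९८७") and like_num("उन्नीस")
# Ordinals one to ten are listed explicitly in _ordinal_words_one_to_ten.
assert like_num("पहला")
# Larger ordinals are matched as a cardinal from _eleven_to_beyond plus the
# suffix "वाँ", e.g. "तिहत्तर" + "वाँ" ("seventy-third").
assert like_num("तिहत्तरवाँ")
# norm() strips the longest matching verbal/nominal suffix it finds.
assert norm("चलता") == "चल"
```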