diff --git a/website/assets/img/docs/pipeline.svg b/website/assets/img/docs/pipeline.svg
index e42c2362f..2ff00d787 100644
--- a/website/assets/img/docs/pipeline.svg
+++ b/website/assets/img/docs/pipeline.svg
@@ -2,7 +2,7 @@
diff --git a/website/docs/usage/_spacy-101/_vocab-stringstore.jade b/website/docs/usage/_spacy-101/_vocab-stringstore.jade
index 3f551c9e1..dd300b5b9 100644
--- a/website/docs/usage/_spacy-101/_vocab-stringstore.jade
+++ b/website/docs/usage/_spacy-101/_vocab-stringstore.jade
@@ -89,4 +89,6 @@ p
p
| Even though both #[code Doc] objects contain the same words, the internal
- | integer IDs are very different.
+ | integer IDs are very different. The same applies to all other strings,
+ | like the annotation scheme. To avoid mismatched IDs, spaCy will always
+ | export the vocab if you save a #[code Doc] or #[code nlp] object.
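The behaviour described in the added lines is easy to demonstrate. Below is a minimal sketch of the round trip, assuming a v2-era install with an 'en' model available; the file path is a placeholder:

import spacy
from spacy.tokens import Doc
from spacy.vocab import Vocab

nlp = spacy.load('en')                    # assumed model shorthand
doc = nlp(u'I like coffee')

# Saving the Doc also exports the vocab, so the string-to-ID mapping
# travels with the annotations.
doc.to_disk('/tmp/coffee.bin')            # placeholder path

# Loading into an empty Vocab works: the exported strings are merged
# back in, so no IDs end up mismatched.
new_doc = Doc(Vocab()).from_disk('/tmp/coffee.bin')
assert [t.text for t in new_doc] == [u'I', u'like', u'coffee']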
diff --git a/website/docs/usage/lightning-tour.jade b/website/docs/usage/lightning-tour.jade
index 7de486070..8cf651be0 100644
--- a/website/docs/usage/lightning-tour.jade
+++ b/website/docs/usage/lightning-tour.jade
@@ -139,6 +139,8 @@ p
new_doc = Doc(Vocab()).from_disk('/moby_dick.bin')
+infobox
+ | #[strong API:] #[+api("language") #[code Language]],
+ | #[+api("doc") #[code Doc]]
| #[strong Usage:] #[+a("/docs/usage/saving-loading") Saving and loading]
+h(2, "rule-matcher") Match text with token rules
diff --git a/website/docs/usage/rule-based-matching.jade b/website/docs/usage/rule-based-matching.jade
index fde6da6ef..1fd398ad9 100644
--- a/website/docs/usage/rule-based-matching.jade
+++ b/website/docs/usage/rule-based-matching.jade
@@ -345,7 +345,7 @@ p
| account and check the #[code subtree] for intensifiers like "very", to
| increase the sentiment score. At some point, you might also want to train
| a sentiment model. However, the approach described in this example is
- | very useful for #[strong bootstrapping rules to gather training data].
+ | very useful for #[strong bootstrapping rules to collect training data].
| It's also an incredibly fast way to gather first insights into your data
| – with about 1 million tweets, you'd be looking at a processing time of
| #[strong under 1 minute].
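The bootstrapping approach in the changed paragraph pairs a match rule with an on-match callback that walks the syntactic subtree. A minimal sketch under the same v2-era API assumptions; the 'HAPPY' rule, the scores, and the use of doc.sentiment are illustrative, not the docs' actual example:

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en')
matcher = Matcher(nlp.vocab)

def boost_sentiment(matcher, doc, i, matches):
    # Runs once per match: award a base score, plus a bonus if an
    # intensifier like "very" occurs in the matched tokens' subtrees.
    match_id, start, end = matches[i]
    score = 0.5
    if any(t.lower_ == 'very' for tok in doc[start:end] for t in tok.subtree):
        score += 0.5
    doc.sentiment += score                # illustrative scoring

matcher.add('HAPPY', boost_sentiment, [{'LOWER': 'happy'}])

doc = nlp(u'I am very happy about this!')
matches = matcher(doc)                    # callbacks fire during the call
print(doc.sentiment)                      # 1.0: base score plus intensifier bonus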