From 02a44c5be2dbbd8a6a0a1d40ecb04bc887ce8fb1 Mon Sep 17 00:00:00 2001
From: "Martin A. Kayser" <9056896+maknotavailable@users.noreply.github.com>
Date: Mon, 3 Feb 2020 03:58:59 -0800
Subject: [PATCH] Adding a note on retrieving the string rep of the match_id
 (#4904)

Stolen from here: https://stackoverflow.com/questions/47638877/using-phrasematcher-in-spacy-to-find-multiple-match-types
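
For reference, the lookup this note documents can be sketched end to end as below. It is a minimal illustration, not part of the patch. It assumes spaCy is installed and that `PhraseMatcher.add` accepts a list of pattern `Doc` objects (the calling convention of newer spaCy releases). The key `GREETING` and the sample text are made up for the example.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Blank English pipeline: tokenizer only, no model download needed.
nlp = spacy.blank("en")

# Register one phrase pattern under a hypothetical string key.
matcher = PhraseMatcher(nlp.vocab)
matcher.add("GREETING", [nlp.make_doc("hello world")])

doc = nlp("she said hello world twice")
matches = matcher(doc)  # list of (match_id, start, end) tuples

# match_id is an integer; the StringStore maps it back to the key.
labelled = [
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matches
]
print(labelled)
```

The list comprehension keeps the integer-to-string lookup next to the span it labels, which is usually what callers want when several pattern keys are registered on one matcher.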
---
 website/docs/api/phrasematcher.md | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/website/docs/api/phrasematcher.md b/website/docs/api/phrasematcher.md
index 90ecd3416..4119c8fc0 100644
--- a/website/docs/api/phrasematcher.md
+++ b/website/docs/api/phrasematcher.md
@@ -70,6 +70,17 @@ Find all token sequences matching the supplied patterns on the `Doc`.
 | `doc`       | `Doc` | The document to match over.                                                                                                                                              |
 | **RETURNS** | list  | A list of `(match_id, start, end)` tuples, describing the matches. A match tuple describes a span `doc[start:end]`. The `match_id` is the ID of the added match pattern. |
 
+<Infobox title="Note on retrieving the string representation of the match_id" variant="warning">
+
+Because spaCy stores all strings as integers, the `match_id` you get back is an integer hash, not the string key you passed when adding the pattern. To recover the string representation, look it up in the vocabulary's `StringStore`, i.e. `nlp.vocab.strings`:
+
+```python
+match_id_string = nlp.vocab.strings[match_id]
+```
+
+</Infobox>
+
+
 ## PhraseMatcher.pipe {#pipe tag="method"}
 
 Match a stream of documents, yielding them in turn.