mirror of https://github.com/explosion/spaCy.git, synced 2025-10-30 15:37:29 +03:00

	Adding a note on retrieving the string rep of the match_id (#4904)
Stolen from here: https://stackoverflow.com/questions/47638877/using-phrasematcher-in-spacy-to-find-multiple-match-types
		
							parent
							
								
									6ff947e1f9
								
							
						
					
					
						commit
						02a44c5be2
					
@@ -70,6 +70,17 @@ Find all token sequences matching the supplied patterns on the `Doc`.
 | `doc`       | `Doc` | The document to match over.                                                                                                                                              |
 | **RETURNS** | list  | A list of `(match_id, start, end)` tuples, describing the matches. A match tuple describes a span `doc[start:end]`. The `match_id` is the ID of the added match pattern. |
 
+<Infobox title="Note on retrieving the string representation of the match_id" variant="warning">
+
+Because spaCy stores all strings as integers, the match_id you get back will be an integer, too – but you can always get the string representation by looking it up in the vocabulary's StringStore, i.e. nlp.vocab.strings:
+
+```
+match_id_string = nlp.vocab.strings[match_id]
+```
+
+</Infobox>
+
+
 ## PhraseMatcher.pipe {#pipe tag="method"}
 
 Match a stream of documents, yielding them in turn.
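
The note added in this commit relies on a two-way mapping between strings and integer IDs. As a rough, library-free illustration of that round trip, the sketch below uses a hypothetical `ToyStringStore` class. It is not spaCy's `StringStore`: the class, its hashing scheme, and the sample match tuples are all assumptions made for the example. In spaCy itself the lookup is the `nlp.vocab.strings[match_id]` call shown in the diff.

```python
import hashlib


class ToyStringStore:
    """Hypothetical stand-in for a string <-> integer store (not spaCy's implementation)."""

    def __init__(self):
        self._by_id = {}

    def add(self, text: str) -> int:
        # Derive a stable 64-bit integer from the string (assumed scheme, for illustration only).
        digest = hashlib.blake2b(text.encode("utf8"), digest_size=8).digest()
        key = int.from_bytes(digest, "big")
        self._by_id[key] = text
        return key

    def __getitem__(self, key: int) -> str:
        # Look the integer ID up to recover the original string.
        return self._by_id[key]


strings = ToyStringStore()
animal_id = strings.add("ANIMAL")
city_id = strings.add("CITY")

# Invented matches in the (match_id, start, end) shape described in the RETURNS row.
matches = [(animal_id, 0, 2), (city_id, 5, 6)]

# Resolve each integer match_id back to its string label.
labelled = [(strings[match_id], start, end) for match_id, start, end in matches]
print(labelled)
```

The same pattern applies to real matcher output: iterate over the returned tuples and resolve each `match_id` through the string store before using it as a label.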