mirror of
				https://github.com/explosion/spaCy.git
				synced 2025-10-31 07:57:35 +03:00 
A named entity is a "real-world object" that's assigned a name – for example, a
person, a country, a product or a book title. spaCy can **recognize various
types of named entities in a document, by asking the model for a prediction**.
Because models are statistical and strongly depend on the examples they were
trained on, this doesn't always work _perfectly_ and might need some tuning
later, depending on your use case.

Named entities are available as the `ents` property of a `Doc`:

```python {executable="true"}
import spacy

# Load the small English pipeline, which includes a named entity recognizer
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

# Print each predicted entity with its character offsets and label
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)
```

> - **Text:** The original entity text.
> - **Start:** Character offset of the start of the entity in the `Doc`.
> - **End:** Character offset of the end of the entity in the `Doc` (exclusive).
> - **Label:** Entity label, i.e. type.

| Text        | Start | End | Label   | Description                                          |
| ----------- | :---: | :-: | ------- | ---------------------------------------------------- |
| Apple       |   0   |  5  | `ORG`   | Companies, agencies, institutions.                   |
| U.K.        |  27   | 31  | `GPE`   | Geopolitical entity, i.e. countries, cities, states. |
| \$1 billion |  44   | 54  | `MONEY` | Monetary values, including unit.                     |

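To make the Start and End columns concrete, here is a minimal sketch that does not use spaCy at all: it copies the offsets and labels from the table above into a hypothetical `entities` list and slices the raw string with them. It assumes the offsets are character positions with an exclusive end, which is consistent with the values shown.

```python
# Values copied from the table above; no spaCy install is needed to run this.
text = "Apple is looking at buying U.K. startup for $1 billion"
entities = [
    (0, 5, "ORG"),
    (27, 31, "GPE"),
    (44, 54, "MONEY"),
]

# Slicing the raw text with (start, end) recovers each entity's text,
# because the end offset points one past the entity's last character.
spans = [(text[start:end], label) for start, end, label in entities]

for span_text, label in spans:
    print(span_text, label)
```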
Using spaCy's built-in [displaCy visualizer](/usage/visualizers), here's what
our example sentence and its named entities look like:

<Standalone height={120}>
<div style={{lineHeight: 2.5, fontFamily: "-apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol'", fontSize: 18}}><mark style={{ background: '#7aecec', padding: '0.45em 0.6em', margin: '0 0.25em', lineHeight: 1, borderRadius: '0.35em'}}>Apple <span style={{ fontSize: '0.8em', fontWeight: 'bold', lineHeight: 1, borderRadius: '0.35em', marginLeft: '0.5rem'}}>ORG</span></mark> is looking at buying <mark style={{ background: '#feca74', padding: '0.45em 0.6em', margin: '0 0.25em', lineHeight: 1, borderRadius: '0.35em'}}>U.K. <span style={{ fontSize: '0.8em', fontWeight: 'bold', lineHeight: 1, borderRadius: '0.35em', marginLeft: '0.5rem'}}>GPE</span></mark> startup for <mark style={{ background: '#e4e7d2', padding: '0.45em 0.6em', margin: '0 0.25em', lineHeight: 1, borderRadius: '0.35em'}}>$1 billion <span style={{ fontSize: '0.8em', fontWeight: 'bold', lineHeight: 1, borderRadius: '0.35em', marginLeft: '0.5rem'}}>MONEY</span></mark></div>
</Standalone>
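A visualization like the one above can be approximated with displaCy's manual mode. The sketch below assumes spaCy is installed but does not load a trained pipeline: the `example` dictionary is hand-written from the offsets in the table, not produced by a model, and `jupyter=False` is passed so the markup comes back as a string.

```python
from spacy import displacy

# Hand-written entity data in displaCy's manual format (an assumption for
# illustration: these offsets are copied from the table, not predicted).
example = {
    "text": "Apple is looking at buying U.K. startup for $1 billion",
    "ents": [
        {"start": 0, "end": 5, "label": "ORG"},
        {"start": 27, "end": 31, "label": "GPE"},
        {"start": 44, "end": 54, "label": "MONEY"},
    ],
    "title": None,
}

# style="ent" selects the entity visualizer; manual=True tells displaCy the
# input is a plain dict instead of a Doc. The result is an HTML string.
html = displacy.render(example, style="ent", manual=True, jupyter=False)
```

With a processed `Doc` from a loaded pipeline, the same call would take the `doc` directly and omit `manual=True`.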