mirror of https://github.com/explosion/spaCy.git, synced 2025-11-01 00:17:44 +03:00
42 lines, 1.7 KiB, Plaintext

//- 💫 DOCS > USAGE > SPACY 101 > NAMED ENTITIES

p
    |  A named entity is a "real-world object" that's assigned a name – for
    |  example, a person, a country, a product or a book title. spaCy can
    |  #[strong recognise] #[+a("/api/annotation#named-entities") various types]
    |  of named entities in a document, by asking the model for a
    |  #[strong prediction]. Because models are statistical and strongly depend
    |  on the examples they were trained on, this doesn't always work
    |  #[em perfectly] and might need some tuning later, depending on your use
    |  case.

p
    |  Named entities are available as the #[code ents] property of a #[code Doc]:

+code-exec.
    import spacy

    # Load the small English model and run the pipeline on the text
    nlp = spacy.load('en_core_web_sm')
    doc = nlp(u'Apple is looking at buying U.K. startup for $1 billion')

    # Print each entity's text, character offsets and label
    for ent in doc.ents:
        print(ent.text, ent.start_char, ent.end_char, ent.label_)

+aside
    |  #[strong Text]: The original entity text.#[br]
    |  #[strong Start]: Character offset of the start of the entity in the #[code Doc].#[br]
    |  #[strong End]: Character offset of the end of the entity in the #[code Doc] (exclusive).#[br]
    |  #[strong Label]: Entity label, i.e. type.

+table(["Text", "Start", "End", "Label", "Description"])
    - var style = [0, 1, 1, 1, 0]
    +annotation-row(["Apple", 0, 5, "ORG", "Companies, agencies, institutions."], style)
    +annotation-row(["U.K.", 27, 31, "GPE", "Geopolitical entity, i.e. countries, cities, states."], style)
    +annotation-row(["$1 billion", 44, 54, "MONEY", "Monetary values, including unit."], style)

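p
    |  The #[strong Start] and #[strong End] columns in the table above
    |  correspond to #[code ent.start_char] and #[code ent.end_char], i.e.
    |  character offsets into the original text, with the end offset exclusive.
    |  The following is a minimal sketch in plain Python that checks this by
    |  slicing the example sentence. The #[code entities] list is copied by hand
    |  from the table rather than produced by a model, and
    |  #[code slice_entities] is a hypothetical helper, not part of spaCy.

```python
# Example sentence and entity rows, copied by hand from the table above.
# No spaCy model is loaded here; this only illustrates the offsets.
text = 'Apple is looking at buying U.K. startup for $1 billion'
entities = [
    ('Apple', 0, 5, 'ORG'),
    ('U.K.', 27, 31, 'GPE'),
    ('$1 billion', 44, 54, 'MONEY'),
]


def slice_entities(text, entities):
    """Hypothetical helper: cut each entity out of the text by its offsets.

    Each row is (entity_text, start_char, end_char, label); end_char is
    treated as exclusive, like the stop index of a Python slice.
    """
    return [(text[start:end], label) for _, start, end, label in entities]


# Compare the sliced substrings with the entity text listed in the table
for (expected, start, end, label), (found, _) in zip(
        entities, slice_entities(text, entities)):
    print(found, start, end, label, found == expected)
```

p
    |  Because the offsets refer to characters rather than tokens, they can be
    |  used directly to highlight or extract entities from the raw string.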
p
    |  Using spaCy's built-in #[+a("/usage/visualizers") displaCy visualizer],
    |  here's what our example sentence and its named entities look like:

+codepen("2f2ad1408ff79fc6a326ea3aedbb353b", 160)