mirror of
				https://github.com/explosion/spaCy.git
				synced 2025-10-26 05:31:15 +03:00 
			
		
		
		
	* Make Span.char_span optional args keyword-only * Make kb_id and following kw-only * Format
		
			
				
	
	
		
			572 lines
		
	
	
		
			27 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			572 lines
		
	
	
		
			27 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| ---
 | ||
| title: Span
 | ||
| tag: class
 | ||
| source: spacy/tokens/span.pyx
 | ||
| ---
 | ||
| 
 | ||
| A slice from a [`Doc`](/api/doc) object.
 | ||
| 
 | ||
| ## Span.\_\_init\_\_ {id="init",tag="method"}
 | ||
| 
 | ||
| Create a `Span` object from the slice `doc[start : end]`.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[1:4]
 | ||
| > assert [t.text for t in span] ==  ["it", "back", "!"]
 | ||
| > ```
 | ||
| 
 | ||
| | Name          | Description                                                                             |
 | ||
| | ------------- | --------------------------------------------------------------------------------------- |
 | ||
| | `doc`         | The parent document. ~~Doc~~                                                            |
 | ||
| | `start`       | The index of the first token of the span. ~~int~~                                       |
 | ||
| | `end`         | The index of the first token after the span. ~~int~~                                    |
 | ||
| | `label`       | A label to attach to the span, e.g. for named entities. ~~Union[str, int]~~             |
 | ||
| | `vector`      | A meaning representation of the span. ~~numpy.ndarray[ndim=1, dtype=float32]~~          |
 | ||
| | `vector_norm` | The L2 norm of the document's vector representation. ~~float~~                          |
 | ||
| | `kb_id`       | A knowledge base ID to attach to the span, e.g. for named entities. ~~Union[str, int]~~ |
 | ||
| | `span_id`     | An ID to associate with the span. ~~Union[str, int]~~                                   |
 | ||
| 
 | ||
| ## Span.\_\_getitem\_\_ {id="getitem",tag="method"}
 | ||
| 
 | ||
| Get a `Token` object.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[1:4]
 | ||
| > assert span[1].text == "back"
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                     |
 | ||
| | ----------- | ----------------------------------------------- |
 | ||
| | `i`         | The index of the token within the span. ~~int~~ |
 | ||
| | **RETURNS** | The token at `span[i]`. ~~Token~~               |
 | ||
| 
 | ||
| Get a `Span` object.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[1:4]
 | ||
| > assert span[1:3].text == "back!"
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                       |
 | ||
| | ----------- | ------------------------------------------------- |
 | ||
| | `start_end` | The slice of the span to get. ~~Tuple[int, int]~~ |
 | ||
| | **RETURNS** | The span at `span[start : end]`. ~~Span~~         |
 | ||
| 
 | ||
| ## Span.\_\_iter\_\_ {id="iter",tag="method"}
 | ||
| 
 | ||
| Iterate over `Token` objects.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[1:4]
 | ||
| > assert [t.text for t in span] == ["it", "back", "!"]
 | ||
| > ```
 | ||
| 
 | ||
| | Name       | Description                 |
 | ||
| | ---------- | --------------------------- |
 | ||
| | **YIELDS** | A `Token` object. ~~Token~~ |
 | ||
| 
 | ||
| ## Span.\_\_len\_\_ {id="len",tag="method"}
 | ||
| 
 | ||
| Get the number of tokens in the span.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[1:4]
 | ||
| > assert len(span) == 3
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                               |
 | ||
| | ----------- | ----------------------------------------- |
 | ||
| | **RETURNS** | The number of tokens in the span. ~~int~~ |
 | ||
| 
 | ||
| ## Span.set_extension {id="set_extension",tag="classmethod",version="2"}
 | ||
| 
 | ||
| Define a custom attribute on the `Span` which becomes available via `Span._`.
 | ||
| For details, see the documentation on
 | ||
| [custom attributes](/usage/processing-pipelines#custom-components-attributes).
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > from spacy.tokens import Span
 | ||
| > city_getter = lambda span: any(city in span.text for city in ("New York", "Paris", "Berlin"))
 | ||
| > Span.set_extension("has_city", getter=city_getter)
 | ||
| > doc = nlp("I like New York in Autumn")
 | ||
| > assert doc[1:4]._.has_city
 | ||
| > ```
 | ||
| 
 | ||
| | Name      | Description                                                                                                                                                                     |
 | ||
| | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `name`    | Name of the attribute to set by the extension. For example, `"my_attr"` will be available as `span._.my_attr`. ~~str~~                                                          |
 | ||
| | `default` | Optional default value of the attribute if no getter or method is defined. ~~Optional[Any]~~                                                                                    |
 | ||
| | `method`  | Set a custom method on the object, for example `span._.compare(other_span)`. ~~Optional[Callable[[Span, ...], Any]]~~                                                           |
 | ||
| | `getter`  | Getter function that takes the object and returns an attribute value. Is called when the user accesses the `._` attribute. ~~Optional[Callable[[Span], Any]]~~                  |
 | ||
| | `setter`  | Setter function that takes the `Span` and a value, and modifies the object. Is called when the user writes to the `Span._` attribute. ~~Optional[Callable[[Span, Any], None]]~~ |
 | ||
| | `force`   | Force overwriting existing attribute. ~~bool~~                                                                                                                                  |
 | ||
| 
 | ||
| ## Span.get_extension {id="get_extension",tag="classmethod",version="2"}
 | ||
| 
 | ||
| Look up a previously registered extension by name. Returns a 4-tuple
 | ||
| `(default, method, getter, setter)` if the extension is registered. Raises a
 | ||
| `KeyError` otherwise.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > from spacy.tokens import Span
 | ||
| > Span.set_extension("is_city", default=False)
 | ||
| > extension = Span.get_extension("is_city")
 | ||
| > assert extension == (False, None, None, None)
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                                                                                        |
 | ||
| | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `name`      | Name of the extension. ~~str~~                                                                                                                     |
 | ||
| | **RETURNS** | A `(default, method, getter, setter)` tuple of the extension. ~~Tuple[Optional[Any], Optional[Callable], Optional[Callable], Optional[Callable]]~~ |
 | ||
| 
 | ||
| ## Span.has_extension {id="has_extension",tag="classmethod",version="2"}
 | ||
| 
 | ||
| Check whether an extension has been registered on the `Span` class.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > from spacy.tokens import Span
 | ||
| > Span.set_extension("is_city", default=False)
 | ||
| > assert Span.has_extension("is_city")
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                         |
 | ||
| | ----------- | --------------------------------------------------- |
 | ||
| | `name`      | Name of the extension to check. ~~str~~             |
 | ||
| | **RETURNS** | Whether the extension has been registered. ~~bool~~ |
 | ||
| 
 | ||
| ## Span.remove_extension {id="remove_extension",tag="classmethod",version="2.0.12"}
 | ||
| 
 | ||
| Remove a previously registered extension.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > from spacy.tokens import Span
 | ||
| > Span.set_extension("is_city", default=False)
 | ||
| > removed = Span.remove_extension("is_city")
 | ||
| > assert not Span.has_extension("is_city")
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                                                                                                |
 | ||
| | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `name`      | Name of the extension. ~~str~~                                                                                                                             |
 | ||
| | **RETURNS** | A `(default, method, getter, setter)` tuple of the removed extension. ~~Tuple[Optional[Any], Optional[Callable], Optional[Callable], Optional[Callable]]~~ |
 | ||
| 
 | ||
| ## Span.char_span {id="char_span",tag="method",version="2.2.4"}
 | ||
| 
 | ||
| Create a `Span` object from the slice `span.text[start:end]`. Returns `None` if
 | ||
| the character indices don't map to a valid span.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York")
 | ||
| > span = doc[1:4].char_span(5, 13, label="GPE")
 | ||
| > assert span.text == "New York"
 | ||
| > ```
 | ||
| 
 | ||
| | Name                                            | Description                                                                                                                                                                                                                                                                  |
 | ||
| | ----------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `start_idx`                                     | The index of the first character of the span. ~~int~~                                                                                                                                                                                                                        |
 | ||
| | `end_idx`                                       | The index of the last character after the span. ~~int~~                                                                                                                                                                                                                      |
 | ||
| | `label`                                         | A label to attach to the span, e.g. for named entities. ~~Union[int, str]~~                                                                                                                                                                                                  |
 | ||
| | _keyword-only_                                  |                                                                                                                                                                                                                                                                              |
 | ||
| | `kb_id`                                         | An ID from a knowledge base to capture the meaning of a named entity. ~~Union[int, str]~~                                                                                                                                                                                    |
 | ||
| | `vector`                                        | A meaning representation of the span. ~~numpy.ndarray[ndim=1, dtype=float32]~~                                                                                                                                                                                               |
 | ||
| | `alignment_mode` <Tag variant="new">3.5.1</Tag> | How character indices snap to token boundaries. Options: `"strict"` (no snapping), `"contract"` (span of all tokens completely within the character span), `"expand"` (span of all tokens at least partially covered by the character span). Defaults to `"strict"`. ~~str~~ |
 | ||
| | `span_id` <Tag variant="new">3.5.1</Tag>        | An identifier to associate with the span. ~~Union[int, str]~~                                                                                                                                                                                                                |
 | ||
| | **RETURNS**                                     | The newly constructed object or `None`. ~~Optional[Span]~~                                                                                                                                                                                                                   |
 | ||
| 
 | ||
| ## Span.similarity {id="similarity",tag="method",model="vectors"}
 | ||
| 
 | ||
| Make a semantic similarity estimate. The default estimate is cosine similarity
 | ||
| using an average of word vectors.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("green apples and red oranges")
 | ||
| > green_apples = doc[:2]
 | ||
| > red_oranges = doc[3:]
 | ||
| > apples_oranges = green_apples.similarity(red_oranges)
 | ||
| > oranges_apples = red_oranges.similarity(green_apples)
 | ||
| > assert apples_oranges == oranges_apples
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                                                                      |
 | ||
| | ----------- | -------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `other`     | The object to compare with. By default, accepts `Doc`, `Span`, `Token` and `Lexeme` objects. ~~Union[Doc, Span, Token, Lexeme]~~ |
 | ||
| | **RETURNS** | A scalar similarity score. Higher is more similar. ~~float~~                                                                     |
 | ||
| 
 | ||
| ## Span.get_lca_matrix {id="get_lca_matrix",tag="method"}
 | ||
| 
 | ||
| Calculates the lowest common ancestor matrix for a given `Span`. Returns LCA
 | ||
| matrix containing the integer index of the ancestor, or `-1` if no common
 | ||
| ancestor is found, e.g. if span excludes a necessary ancestor.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn")
 | ||
| > span = doc[1:4]
 | ||
| > matrix = span.get_lca_matrix()
 | ||
| > # array([[0, 0, 0], [0, 1, 2], [0, 2, 2]], dtype=int32)
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                             |
 | ||
| | ----------- | --------------------------------------------------------------------------------------- |
 | ||
| | **RETURNS** | The lowest common ancestor matrix of the `Span`. ~~numpy.ndarray[ndim=2, dtype=int32]~~ |
 | ||
| 
 | ||
| ## Span.to_array {id="to_array",tag="method",version="2"}
 | ||
| 
 | ||
| Given a list of `M` attribute IDs, export the tokens to a numpy `ndarray` of
 | ||
| shape `(N, M)`, where `N` is the length of the document. The values will be
 | ||
| 32-bit integers.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > from spacy.attrs import LOWER, POS, ENT_TYPE, IS_ALPHA
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > span = doc[2:3]
 | ||
| > # All strings mapped to integers, for easy export to numpy
 | ||
| > np_array = span.to_array([LOWER, POS, ENT_TYPE, IS_ALPHA])
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                                                                              |
 | ||
| | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `attr_ids`  | A list of attributes (int IDs or string names) or a single attribute (int ID or string name). ~~Union[int, str, List[Union[int, str]]]~~ |
 | ||
| | **RETURNS** | The exported attributes as a numpy array. ~~Union[numpy.ndarray[ndim=2, dtype=uint64], numpy.ndarray[ndim=1, dtype=uint64]]~~            |
 | ||
| 
 | ||
| ## Span.ents {id="ents",tag="property",version="2.0.13",model="ner"}
 | ||
| 
 | ||
| The named entities that fall completely within the span. Returns a tuple of
 | ||
| `Span` objects.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Mr. Best flew to New York on Saturday morning.")
 | ||
| > span = doc[0:6]
 | ||
| > ents = list(span.ents)
 | ||
| > assert ents[0].label == 346
 | ||
| > assert ents[0].label_ == "PERSON"
 | ||
| > assert ents[0].text == "Mr. Best"
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                       |
 | ||
| | ----------- | ----------------------------------------------------------------- |
 | ||
| | **RETURNS** | Entities in the span, one `Span` per entity. ~~Tuple[Span, ...]~~ |
 | ||
| 
 | ||
| ## Span.noun_chunks {id="noun_chunks",tag="property",model="parser"}
 | ||
| 
 | ||
| Iterate over the base noun phrases in the span. Yields base noun-phrase `Span`
 | ||
| objects, if the document has been syntactically parsed. A base noun phrase, or
 | ||
| "NP chunk", is a noun phrase that does not permit other NPs to be nested within
 | ||
| it – so no NP-level coordination, no prepositional phrases, and no relative
 | ||
| clauses.
 | ||
| 
 | ||
| If the `noun_chunk` [syntax iterator](/usage/linguistic-features#language-data)
 | ||
| has not been implemeted for the given language, a `NotImplementedError` is
 | ||
| raised.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("A phrase with another phrase occurs.")
 | ||
| > span = doc[3:5]
 | ||
| > chunks = list(span.noun_chunks)
 | ||
| > assert len(chunks) == 1
 | ||
| > assert chunks[0].text == "another phrase"
 | ||
| > ```
 | ||
| 
 | ||
| | Name       | Description                       |
 | ||
| | ---------- | --------------------------------- |
 | ||
| | **YIELDS** | Noun chunks in the span. ~~Span~~ |
 | ||
| 
 | ||
| ## Span.as_doc {id="as_doc",tag="method"}
 | ||
| 
 | ||
| Create a new `Doc` object corresponding to the `Span`, with a copy of the data.
 | ||
| 
 | ||
| When calling this on many spans from the same doc, passing in a precomputed
 | ||
| array representation of the doc using the `array_head` and `array` args can save
 | ||
| time.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > span = doc[2:4]
 | ||
| > doc2 = span.as_doc()
 | ||
| > assert doc2.text == "New York"
 | ||
| > ```
 | ||
| 
 | ||
| | Name             | Description                                                                                                          |
 | ||
| | ---------------- | -------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `copy_user_data` | Whether or not to copy the original doc's user data. ~~bool~~                                                        |
 | ||
| | `array_head`     | Precomputed array attributes (headers) of the original doc, as generated by `Doc._get_array_attrs()`. ~~Tuple~~      |
 | ||
| | `array`          | Precomputed array version of the original doc as generated by [`Doc.to_array`](/api/doc#to_array). ~~numpy.ndarray~~ |
 | ||
| | **RETURNS**      | A `Doc` object of the `Span`'s content. ~~Doc~~                                                                      |
 | ||
| 
 | ||
| ## Span.root {id="root",tag="property",model="parser"}
 | ||
| 
 | ||
| The token with the shortest path to the root of the sentence (or the root
 | ||
| itself). If multiple tokens are equally high in the tree, the first token is
 | ||
| taken.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > i, like, new, york, in_, autumn, dot = range(len(doc))
 | ||
| > assert doc[new].head.text == "York"
 | ||
| > assert doc[york].head.text == "like"
 | ||
| > new_york = doc[new:york+1]
 | ||
| > assert new_york.root.text == "York"
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description               |
 | ||
| | ----------- | ------------------------- |
 | ||
| | **RETURNS** | The root token. ~~Token~~ |
 | ||
| 
 | ||
| ## Span.conjuncts {id="conjuncts",tag="property",model="parser"}
 | ||
| 
 | ||
| A tuple of tokens coordinated to `span.root`.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like apples and oranges")
 | ||
| > apples_conjuncts = doc[2:3].conjuncts
 | ||
| > assert [t.text for t in apples_conjuncts] == ["oranges"]
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                   |
 | ||
| | ----------- | --------------------------------------------- |
 | ||
| | **RETURNS** | The coordinated tokens. ~~Tuple[Token, ...]~~ |
 | ||
| 
 | ||
| ## Span.lefts {id="lefts",tag="property",model="parser"}
 | ||
| 
 | ||
| Tokens that are to the left of the span, whose heads are within the span.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > lefts = [t.text for t in doc[3:7].lefts]
 | ||
| > assert lefts == ["New"]
 | ||
| > ```
 | ||
| 
 | ||
| | Name       | Description                                    |
 | ||
| | ---------- | ---------------------------------------------- |
 | ||
| | **YIELDS** | A left-child of a token of the span. ~~Token~~ |
 | ||
| 
 | ||
| ## Span.rights {id="rights",tag="property",model="parser"}
 | ||
| 
 | ||
| Tokens that are to the right of the span, whose heads are within the span.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > rights = [t.text for t in doc[2:4].rights]
 | ||
| > assert rights == ["in"]
 | ||
| > ```
 | ||
| 
 | ||
| | Name       | Description                                     |
 | ||
| | ---------- | ----------------------------------------------- |
 | ||
| | **YIELDS** | A right-child of a token of the span. ~~Token~~ |
 | ||
| 
 | ||
| ## Span.n_lefts {id="n_lefts",tag="property",model="parser"}
 | ||
| 
 | ||
| The number of tokens that are to the left of the span, whose heads are within
 | ||
| the span.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > assert doc[3:7].n_lefts == 1
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                              |
 | ||
| | ----------- | ---------------------------------------- |
 | ||
| | **RETURNS** | The number of left-child tokens. ~~int~~ |
 | ||
| 
 | ||
| ## Span.n_rights {id="n_rights",tag="property",model="parser"}
 | ||
| 
 | ||
| The number of tokens that are to the right of the span, whose heads are within
 | ||
| the span.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like New York in Autumn.")
 | ||
| > assert doc[2:4].n_rights == 1
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                               |
 | ||
| | ----------- | ----------------------------------------- |
 | ||
| | **RETURNS** | The number of right-child tokens. ~~int~~ |
 | ||
| 
 | ||
| ## Span.subtree {id="subtree",tag="property",model="parser"}
 | ||
| 
 | ||
| Tokens within the span and tokens which descend from them.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > subtree = [t.text for t in doc[:3].subtree]
 | ||
| > assert subtree == ["Give", "it", "back", "!"]
 | ||
| > ```
 | ||
| 
 | ||
| | Name       | Description                                                 |
 | ||
| | ---------- | ----------------------------------------------------------- |
 | ||
| | **YIELDS** | A token within the span, or a descendant from it. ~~Token~~ |
 | ||
| 
 | ||
| ## Span.has_vector {id="has_vector",tag="property",model="vectors"}
 | ||
| 
 | ||
| A boolean value indicating whether a word vector is associated with the object.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like apples")
 | ||
| > assert doc[1:].has_vector
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                           |
 | ||
| | ----------- | ----------------------------------------------------- |
 | ||
| | **RETURNS** | Whether the span has a vector data attached. ~~bool~~ |
 | ||
| 
 | ||
| ## Span.vector {id="vector",tag="property",model="vectors"}
 | ||
| 
 | ||
| A real-valued meaning representation. Defaults to an average of the token
 | ||
| vectors.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like apples")
 | ||
| > assert doc[1:].vector.dtype == "float32"
 | ||
| > assert doc[1:].vector.shape == (300,)
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                                     |
 | ||
| | ----------- | ----------------------------------------------------------------------------------------------- |
 | ||
| | **RETURNS** | A 1-dimensional array representing the span's vector. ~~`numpy.ndarray[ndim=1, dtype=float32]~~ |
 | ||
| 
 | ||
| ## Span.vector_norm {id="vector_norm",tag="property",model="vectors"}
 | ||
| 
 | ||
| The L2 norm of the span's vector representation.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("I like apples")
 | ||
| > doc[1:].vector_norm # 4.800883928527915
 | ||
| > doc[2:].vector_norm # 6.895897646384268
 | ||
| > assert doc[1:].vector_norm != doc[2:].vector_norm
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                         |
 | ||
| | ----------- | --------------------------------------------------- |
 | ||
| | **RETURNS** | The L2 norm of the vector representation. ~~float~~ |
 | ||
| 
 | ||
| ## Span.sent {id="sent",tag="property",model="sentences"}
 | ||
| 
 | ||
| The sentence span that this span is a part of. This property is only available
 | ||
| when [sentence boundaries](/usage/linguistic-features#sbd) have been set on the
 | ||
| document by the `parser`, `senter`, `sentencizer` or some custom function. It
 | ||
| will raise an error otherwise.
 | ||
| 
 | ||
| If the span happens to cross sentence boundaries, only the first sentence will
 | ||
| be returned. If it is required that the sentence always includes the full span,
 | ||
| the result can be adjusted as such:
 | ||
| 
 | ||
| ```python
 | ||
| sent = span.sent
 | ||
| sent = doc[sent.start : max(sent.end, span.end)]
 | ||
| ```
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[1:3]
 | ||
| > assert span.sent.text == "Give it back!"
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                             |
 | ||
| | ----------- | ------------------------------------------------------- |
 | ||
| | **RETURNS** | The sentence span that this span is a part of. ~~Span~~ |
 | ||
| 
 | ||
| ## Span.sents {id="sents",tag="property",model="sentences",version="3.2.1"}
 | ||
| 
 | ||
| Returns a generator over the sentences the span belongs to. This property is
 | ||
| only available when [sentence boundaries](/usage/linguistic-features#sbd) have
 | ||
| been set on the document by the `parser`, `senter`, `sentencizer` or some custom
 | ||
| function. It will raise an error otherwise.
 | ||
| 
 | ||
| If the span happens to cross sentence boundaries, all sentences the span
 | ||
| overlaps with will be returned.
 | ||
| 
 | ||
| > #### Example
 | ||
| >
 | ||
| > ```python
 | ||
| > doc = nlp("Give it back! He pleaded.")
 | ||
| > span = doc[2:4]
 | ||
| > assert len(span.sents) == 2
 | ||
| > ```
 | ||
| 
 | ||
| | Name        | Description                                                                |
 | ||
| | ----------- | -------------------------------------------------------------------------- |
 | ||
| | **RETURNS** | A generator yielding sentences this `Span` is a part of ~~Iterable[Span]~~ |
 | ||
| 
 | ||
| ## Attributes {id="attributes"}
 | ||
| 
 | ||
| | Name           | Description                                                                                                                   |
 | ||
| | -------------- | ----------------------------------------------------------------------------------------------------------------------------- |
 | ||
| | `doc`          | The parent document. ~~Doc~~                                                                                                  |
 | ||
| | `tensor`       | The span's slice of the parent `Doc`'s tensor. ~~numpy.ndarray~~                                                              |
 | ||
| | `start`        | The token offset for the start of the span. ~~int~~                                                                           |
 | ||
| | `end`          | The token offset for the end of the span. ~~int~~                                                                             |
 | ||
| | `start_char`   | The character offset for the start of the span. ~~int~~                                                                       |
 | ||
| | `end_char`     | The character offset for the end of the span. ~~int~~                                                                         |
 | ||
| | `text`         | A string representation of the span text. ~~str~~                                                                             |
 | ||
| | `text_with_ws` | The text content of the span with a trailing whitespace character if the last token has one. ~~str~~                          |
 | ||
| | `orth`         | ID of the verbatim text content. ~~int~~                                                                                      |
 | ||
| | `orth_`        | Verbatim text content (identical to `Span.text`). Exists mostly for consistency with the other attributes. ~~str~~            |
 | ||
| | `label`        | The hash value of the span's label. ~~int~~                                                                                   |
 | ||
| | `label_`       | The span's label. ~~str~~                                                                                                     |
 | ||
| | `lemma_`       | The span's lemma. Equivalent to `"".join(token.text_with_ws for token in span)`. ~~str~~                                      |
 | ||
| | `kb_id`        | The hash value of the knowledge base ID referred to by the span. ~~int~~                                                      |
 | ||
| | `kb_id_`       | The knowledge base ID referred to by the span. ~~str~~                                                                        |
 | ||
| | `ent_id`       | Alias for `id`: the hash value of the span's ID. ~~int~~                                                                      |
 | ||
| | `ent_id_`      | Alias for `id_`: the span's ID. ~~str~~                                                                                       |
 | ||
| | `id`           | The hash value of the span's ID. ~~int~~                                                                                      |
 | ||
| | `id_`          | The span's ID. ~~str~~                                                                                                        |
 | ||
| | `_`            | User space for adding custom [attribute extensions](/usage/processing-pipelines#custom-components-attributes). ~~Underscore~~ |
 |