Create a Span object from the slice doc[start : end].
Example
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]
assert [t.text for t in span] == ["it", "back", "!"]
Name
Description
doc
The parent document. Doc
start
The index of the first token of the span. int
end
The index of the first token after the span. int
label
A label to attach to the span, e.g. for named entities. Union[str, int]
kb_id
A knowledge base ID to attach to the span, e.g. for named entities. Union[str, int]
vector
A meaning representation of the span. numpy.ndarray[ndim=1, dtype=float32]
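The slice syntax is the usual route, but the constructor arguments in the table can also be passed directly. A minimal sketch, assuming spaCy v3 is installed and using a blank English pipeline (`spacy.blank("en")`) so that no trained model is required; the label `"DEMO"` is an arbitrary value chosen for illustration:

```python
import spacy
from spacy.tokens import Span

# Assumption: a blank pipeline is enough here, since only tokenization is needed.
nlp = spacy.blank("en")
doc = nlp("Give it back! He pleaded.")

# Covers the same tokens as doc[1:4], with a label attached at construction time.
# "DEMO" is a made-up label, not one produced by any trained model.
span = Span(doc, 1, 4, label="DEMO")
tokens = [t.text for t in span]
label = span.label_
```

Passing the label as a string is convenient; the integer form in the table refers to the hash of that string.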
Span.__getitem__
Get a Token object.
Example
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]
assert span[1].text == "back"
Name
Description
i
The index of the token within the span. int
RETURNS
The token at span[i]. Token
Get a Span object.
Example
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]
assert span[1:3].text == "back!"
Name
Description
start_end
The slice of the span to get. Tuple[int, int]
RETURNS
The span at span[start : end]. Span
Span.__iter__
Iterate over Token objects.
Example
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]
assert [t.text for t in span] == ["it", "back", "!"]
Name
Description
YIELDS
A Token object. Token
Span.__len__
Get the number of tokens in the span.
Example
doc = nlp("Give it back! He pleaded.")
span = doc[1:4]
assert len(span) == 3
Name
Description
RETURNS
The number of tokens in the span. int
Span.set_extension
Define a custom attribute on the Span which becomes available via Span._.
For details, see the documentation on
custom attributes.
Example
from spacy.tokens import Span
city_getter = lambda span: any(city in span.text for city in ("New York", "Paris", "Berlin"))
Span.set_extension("has_city", getter=city_getter)
doc = nlp("I like New York in Autumn")
assert doc[1:4]._.has_city
Name
Description
name
Name of the attribute to set by the extension. For example, "my_attr" will be available as span._.my_attr. str
default
Optional default value of the attribute if no getter or method is defined. Optional[Any]
method
Set a custom method on the object, for example span._.compare(other_span). Optional[Callable[[Span, ...], Any]]
getter
Getter function that takes the object and returns an attribute value. Is called when the user accesses the ._ attribute. Optional[Callable[[Span], Any]]
setter
Setter function that takes the Span and a value, and modifies the object. Is called when the user writes to the Span._ attribute. Optional[Callable[[Span, Any], None]]
force
Force overwriting existing attribute. bool
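The parameters in the table correspond to different kinds of extension. The following is a minimal sketch, assuming spaCy v3 and a blank English pipeline; the extension names `reviewed`, `n_chars` and `starts_with` are hypothetical and exist only for this illustration:

```python
import spacy
from spacy.tokens import Span

nlp = spacy.blank("en")  # assumption: tokenizer-only pipeline, no trained model

# force=True lets the snippet be re-run without raising on an existing attribute.
# All three extension names are made up for this example.
Span.set_extension("reviewed", default=False, force=True)
Span.set_extension("n_chars", getter=lambda span: len(span.text), force=True)
Span.set_extension(
    "starts_with",
    method=lambda span, prefix: span.text.startswith(prefix),
    force=True,
)

doc = nlp("I like New York in Autumn")
span = doc[2:4]  # "New York"

reviewed_before = span._.reviewed          # falls back to the default
span._.reviewed = True                     # writable because a default was given
reviewed_after = span._.reviewed
n_chars = span._.n_chars                   # computed by the getter on access
starts_with_new = span._.starts_with("New")  # method receives the span first
```

A default suits values that are stored and overwritten, a getter suits values derived from the span each time they are read, and a method suits operations that need extra arguments.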
Span.get_extension
Look up a previously registered extension by name. Returns a 4-tuple
(default, method, getter, setter) if the extension is registered. Raises a
KeyError otherwise.
Name
Description
name
Name of the extension. str
RETURNS
A (default, method, getter, setter) tuple of the extension. Tuple[Optional[Any], Optional[Callable], Optional[Callable], Optional[Callable]]
Span.char_span
Create a Span object from the slice span.text[start:end]. Returns None if
the character indices don't map to a valid span.
Example
doc = nlp("I like New York")
span = doc[1:4].char_span(5, 13, label="GPE")
assert span.text == "New York"
Name
Description
start
The index of the first character of the span. int
end
The index of the first character after the span. int
label
A label to attach to the span, e.g. for named entities. Union[int, str]
kb_id (new in v2.2)
An ID from a knowledge base to capture the meaning of a named entity. Union[int, str]
vector
A meaning representation of the span. numpy.ndarray[ndim=1, dtype=float32]
RETURNS
The newly constructed object or None. Optional[Span]
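The None return value is easiest to understand as a boundary check: both character offsets have to coincide with token boundaries. The following pure-Python toy model illustrates that rule. It is not spaCy's implementation, and the helper name `align_to_tokens` is hypothetical:

```python
from typing import List, Optional, Tuple


def align_to_tokens(
    token_offsets: List[Tuple[int, int]], start: int, end: int
) -> Optional[Tuple[int, int]]:
    """Map character offsets [start, end) to token indices (first, last + 1).

    Returns None unless start is the first character of some token and
    end is the position just after the last character of some token.
    """
    starts = {s: i for i, (s, _) in enumerate(token_offsets)}
    ends = {e: i for i, (_, e) in enumerate(token_offsets)}
    if start not in starts or end not in ends:
        return None
    first, last = starts[start], ends[end]
    if last < first:
        return None
    return first, last + 1


# Character offsets of "like", "New", "York" within the text "like New York".
offsets = [(0, 4), (5, 8), (9, 13)]
aligned = align_to_tokens(offsets, 5, 13)     # exactly covers "New York"
misaligned = align_to_tokens(offsets, 6, 13)  # starts inside "New"
```

Here `aligned` identifies tokens 1 and 2 of the span, while `misaligned` is None because offset 6 falls in the middle of a token.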
Span.similarity
Make a semantic similarity estimate. The default estimate is cosine similarity
using an average of word vectors.
Example
doc = nlp("green apples and red oranges")
green_apples = doc[:2]
red_oranges = doc[3:]
apples_oranges = green_apples.similarity(red_oranges)
oranges_apples = red_oranges.similarity(green_apples)
assert apples_oranges == oranges_apples
Name
Description
other
The object to compare with. By default, accepts Doc, Span, Token and Lexeme objects. Union[Doc, Span, Token, Lexeme]
RETURNS
A scalar similarity score. Higher is more similar. float
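The default estimate described above, cosine similarity over averaged word vectors, can be sketched in plain Python. This is an illustration of the arithmetic only, with made-up 3-dimensional vectors; the helper names `mean_vector` and `cosine` are hypothetical and not part of spaCy:

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of equally sized vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up token vectors standing in for "green apples" and "red oranges".
green_apples = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
red_oranges = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]

score_ab = cosine(mean_vector(green_apples), mean_vector(red_oranges))
score_ba = cosine(mean_vector(red_oranges), mean_vector(green_apples))
```

Because the dot product and the product of norms are both symmetric, the two scores agree, which is the property the assertion in the example relies on.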
Span.get_lca_matrix
Calculates the lowest common ancestor (LCA) matrix for a given Span. Each entry
holds the integer index, relative to the span, of the lowest common ancestor of
a pair of tokens, or -1 if no common ancestor is found, e.g. if the span
excludes a necessary ancestor.
Example
doc = nlp("I like New York in Autumn")
span = doc[1:4]
matrix = span.get_lca_matrix()
# array([[0, 0, 0], [0, 1, 2], [0, 2, 2]], dtype=int32)
Name
Description
RETURNS
The lowest common ancestor matrix of the Span. numpy.ndarray[ndim=2, dtype=int32]
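To make the matrix entries concrete, here is a pure-Python sketch that computes them from a list of head indices. It is a toy model rather than spaCy's implementation; the function name `lca_matrix` and the hand-written `heads` list (an assumed parse in which each root is its own head) are illustrative only:

```python
from typing import List


def lca_matrix(heads: List[int], start: int, end: int) -> List[List[int]]:
    """LCA matrix for tokens start..end-1, with indices relative to start.

    heads[i] is the document index of token i's head; a root is its own head.
    An entry is -1 when the lowest common ancestor lies outside the span.
    """

    def chain(i: int) -> List[int]:
        # Token i followed by its ancestors, up to and including the root.
        path = [i]
        while heads[path[-1]] != path[-1]:
            path.append(heads[path[-1]])
        return path

    n = end - start
    matrix = [[-1] * n for _ in range(n)]
    for j in range(start, end):
        ancestors_j = set(chain(j))
        for k in range(start, end):
            # First node on k's path to the root that is also on j's path.
            lca = next((a for a in chain(k) if a in ancestors_j), -1)
            if start <= lca < end:
                matrix[j - start][k - start] = lca - start
    return matrix


# Assumed parse of "I like New York in Autumn":
# I -> like, like = root, New -> York, York -> like, in -> like, Autumn -> in
heads = [1, 1, 3, 1, 1, 4]
matrix = lca_matrix(heads, 1, 4)     # span "like New York"
truncated = lca_matrix(heads, 3, 5)  # span "York in": their LCA "like" is outside
```

Under this assumed parse, `matrix` reproduces the array shown in the example, and `truncated` contains -1 off the diagonal because the shared ancestor was cut out of the span.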
Span.to_array
Given a list of M attribute IDs, export the tokens to a numpy ndarray of
shape (N, M), where N is the length of the span. The values will be
64-bit unsigned integers.
Example
from spacy.attrs import LOWER, POS, ENT_TYPE, IS_ALPHA
doc = nlp("I like New York in Autumn.")
span = doc[2:3]
# All strings mapped to integers, for easy export to numpy
np_array = span.to_array([LOWER, POS, ENT_TYPE, IS_ALPHA])
Name
Description
attr_ids
A list of attributes (int IDs or string names) or a single attribute (int ID or string name). Union[int, str, List[Union[int, str]]]
RETURNS
The exported attributes as a numpy array. Union[numpy.ndarray[ndim=2, dtype=uint64], numpy.ndarray[ndim=1, dtype=uint64]]
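The shape of the result follows the span, not the parent document. A minimal sketch, assuming spaCy v3 and a blank English pipeline, so only lexical attributes such as LOWER and IS_ALPHA carry meaningful values:

```python
import spacy
from spacy.attrs import IS_ALPHA, LOWER

# Assumption: a blank pipeline, so no tagger or entity recognizer has run.
nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn.")
span = doc[2:4]  # "New York"

arr = span.to_array([LOWER, IS_ALPHA])
shape = arr.shape                      # one row per token, one column per attribute
lower_id = int(arr[0, 0])              # hash of the lowercase form of "New"
alpha_flags = [int(v) for v in arr[:, 1]]
```

String-valued attributes come back as hash IDs, which can be resolved through the vocabulary's string store, for example `nlp.vocab.strings[lower_id]`.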
Span.ents
The named entities in the span. Returns a tuple of named entity Span objects,
if the entity recognizer has been applied.
Example
doc = nlp("Mr. Best flew to New York on Saturday morning.")
span = doc[0:6]
ents = list(span.ents)
assert ents[0].label == 346
assert ents[0].label_ == "PERSON"
assert ents[0].text == "Mr. Best"
Name
Description
RETURNS
Entities in the span, one Span per entity. Tuple[Span, ...]
Span.as_doc
Create a new Doc object corresponding to the Span, with a copy of the data.
Example
doc = nlp("I like New York in Autumn.")
span = doc[2:4]
doc2 = span.as_doc()
assert doc2.text == "New York"
Name
Description
copy_user_data
Whether or not to copy the original doc's user data. bool
RETURNS
A Doc object of the Span's content. Doc
Span.root
The token with the shortest path to the root of the sentence (or the root
itself). If multiple tokens are equally high in the tree, the first token is
taken.
Example
doc = nlp("I like New York in Autumn.")
i, like, new, york, in_, autumn, dot = range(len(doc))
assert doc[new].head.text == "York"
assert doc[york].head.text == "like"
new_york = doc[new:york+1]
assert new_york.root.text == "York"
Name
Description
RETURNS
The root token. Token
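The selection rule, shortest path to the sentence root with ties going to the first token, can be modelled in a few lines of plain Python. This is a sketch of the rule as described, not spaCy's code; `span_root` and the hand-written `heads` list are assumptions made for the illustration:

```python
from typing import List


def span_root(heads: List[int], start: int, end: int) -> int:
    """Index of the token in [start, end) closest to its sentence root.

    heads[i] is the index of token i's head; a root is its own head.
    """

    def depth(i: int) -> int:
        # Number of head links between token i and the root.
        steps = 0
        while heads[i] != i:
            i = heads[i]
            steps += 1
        return steps

    # min() keeps the first minimal element, so ties go to the earliest token.
    return min(range(start, end), key=depth)


words = ["I", "like", "New", "York", "in", "Autumn", "."]
# Assumed parse: I -> like, like = root, New -> York, York -> like,
# in -> York, Autumn -> in, "." -> like
heads = [1, 1, 3, 1, 3, 4, 1]

root_new_york = words[span_root(heads, 2, 4)]  # span "New York"
root_tie = words[span_root(heads, 3, 7)]       # "York" and "." are equally high
```

In the second call both "York" and "." sit one step below the root, and the earlier token is returned.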
Span.conjuncts
A tuple of tokens coordinated to span.root.
Example
doc = nlp("I like apples and oranges")
apples_conjuncts = doc[2:3].conjuncts
assert [t.text for t in apples_conjuncts] == ["oranges"]
Name
Description
RETURNS
The coordinated tokens. Tuple[Token, ...]
Span.lefts
Tokens that are to the left of the span, whose heads are within the span.
Example
doc = nlp("I like New York in Autumn.")
lefts = [t.text for t in doc[3:7].lefts]
assert lefts == ["New"]
Name
Description
YIELDS
A left-child of a token of the span. Token
Span.rights
Tokens that are to the right of the span, whose heads are within the span.
Example
doc = nlp("I like New York in Autumn.")
rights = [t.text for t in doc[2:4].rights]
assert rights == ["in"]
Name
Description
YIELDS
A right-child of a token of the span. Token
Span.n_lefts
The number of tokens that are to the left of the span, whose heads are within
the span.
Example
doc = nlp("I like New York in Autumn.")
assert doc[3:7].n_lefts == 1
Name
Description
RETURNS
The number of left-child tokens. int
Span.n_rights
The number of tokens that are to the right of the span, whose heads are within
the span.
Example
doc = nlp("I like New York in Autumn.")
assert doc[2:4].n_rights == 1
Name
Description
RETURNS
The number of right-child tokens. int
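The four properties lefts, rights, n_lefts and n_rights share one definition: a token outside the span counts if its head lies inside the span. A pure-Python sketch of that definition follows. It models the rule only, and the helper names and the `heads` list are assumptions, not spaCy API:

```python
from typing import List


def span_lefts(heads: List[int], start: int, end: int) -> List[int]:
    """Tokens before the span whose head is inside [start, end)."""
    return [i for i in range(0, start) if start <= heads[i] < end]


def span_rights(heads: List[int], start: int, end: int) -> List[int]:
    """Tokens after the span whose head is inside [start, end)."""
    return [i for i in range(end, len(heads)) if start <= heads[i] < end]


words = ["I", "like", "New", "York", "in", "Autumn", "."]
# Assumed parse: I -> like, like = root (its own head), New -> York,
# York -> like, in -> York, Autumn -> in, "." -> like
heads = [1, 1, 3, 1, 3, 4, 1]

lefts = [words[i] for i in span_lefts(heads, 3, 7)]    # span "York in Autumn ."
rights = [words[i] for i in span_rights(heads, 2, 4)]  # span "New York"
n_lefts, n_rights = len(lefts), len(rights)
```

With this assumed parse the results agree with the examples above: "New" is the only left child of the first span and "in" is the only right child of the second.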
Span.subtree
Tokens within the span and tokens which descend from them.
Example
doc = nlp("Give it back! He pleaded.")
subtree = [t.text for t in doc[:3].subtree]
assert subtree == ["Give", "it", "back", "!"]
Name
Description
YIELDS
A token within the span, or a descendant from it. Token
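Put differently, a token belongs to the subtree if it is in the span or if following its chain of heads eventually reaches a token in the span. The sketch below models that rule in plain Python and returns indices in document order; `span_subtree` and the `heads` list are hypothetical and hand-written for the example sentence:

```python
from typing import List


def span_subtree(heads: List[int], start: int, end: int) -> List[int]:
    """Indices of tokens inside [start, end) or descended from such a token.

    heads[i] is the index of token i's head; a root is its own head.
    """

    def reaches_span(i: int) -> bool:
        # Walk up the tree until the span or a root is reached.
        while True:
            if start <= i < end:
                return True
            if heads[i] == i:
                return False
            i = heads[i]

    return [i for i in range(len(heads)) if reaches_span(i)]


words = ["Give", "it", "back", "!", "He", "pleaded", "."]
# Assumed parse: two sentences rooted at "Give" and "pleaded".
heads = [0, 0, 0, 0, 5, 5, 5]

subtree = [words[i] for i in span_subtree(heads, 0, 3)]
```

The exclamation mark is outside the span doc[:3] but attaches to "Give", so it is included; the second sentence has a different root and is excluded.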
Span.has_vector
A boolean value indicating whether a word vector is associated with the object.
Example
doc = nlp("I like apples")
assert doc[1:].has_vector
Name
Description
RETURNS
Whether the span has vector data attached. bool
Span.vector
A real-valued meaning representation. Defaults to an average of the token
vectors.
Example
doc = nlp("I like apples")
assert doc[1:].vector.dtype == "float32"
assert doc[1:].vector.shape == (300,)
Name
Description
RETURNS
A 1-dimensional array representing the span's vector. numpy.ndarray[ndim=1, dtype=float32]
Span.vector_norm
The L2 norm of the span's vector representation.
Example
doc = nlp("I like apples")
doc[1:].vector_norm  # 4.800883928527915
doc[2:].vector_norm  # 6.895897646384268
assert doc[1:].vector_norm != doc[2:].vector_norm
Name
Description
RETURNS
The L2 norm of the vector representation. float
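The relationship between the default span vector and its norm can be shown with plain arithmetic. The numbers below are made-up 2-dimensional token vectors, chosen so that the result is easy to check by hand; nothing here calls spaCy:

```python
import math

# Made-up token vectors; real word vectors typically have hundreds of dimensions.
token_vectors = [[6.0, 0.0], [0.0, 8.0]]

# Default span vector: the component-wise average of the token vectors.
span_vector = [sum(column) / len(token_vectors) for column in zip(*token_vectors)]

# L2 norm: square root of the sum of squared components.
span_norm = math.sqrt(sum(x * x for x in span_vector))
```

The average is [3.0, 4.0] and its L2 norm is 5.0, which is why spans covering different tokens generally end up with different norms.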
Attributes
Name
Description
doc
The parent document. Doc
tensor (new in v2.1.7)
The span's slice of the parent Doc's tensor. numpy.ndarray
sent
The sentence span that this span is a part of. Span
start
The token offset for the start of the span. int
end
The token offset for the end of the span. int
start_char
The character offset for the start of the span. int
end_char
The character offset for the end of the span. int
text
A string representation of the span text. str
text_with_ws
The text content of the span with a trailing whitespace character if the last token has one. str
orth
ID of the verbatim text content. int
orth_
Verbatim text content (identical to Span.text). Exists mostly for consistency with the other attributes. str
label
The hash value of the span's label. int
label_
The span's label. str
lemma_
The span's lemma. Equivalent to "".join(token.lemma_ + token.whitespace_ for token in span).strip(). str
kb_id
The hash value of the knowledge base ID referred to by the span. int
kb_id_
The knowledge base ID referred to by the span. str
ent_id
The hash value of the named entity the span is an instance of. int
ent_id_
The string ID of the named entity the span is an instance of. str
sentiment
A scalar value indicating the positivity or negativity of the span. float
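The token offsets and character offsets in the table describe the same region in two coordinate systems. A short sketch, assuming spaCy v3 and a blank English pipeline, shows how they line up:

```python
import spacy

# Assumption: a blank pipeline, since only tokenization is needed here.
nlp = spacy.blank("en")
doc = nlp("I like New York in Autumn.")
span = doc[2:4]  # "New York"

token_bounds = (span.start, span.end)           # token offsets into the doc
char_bounds = (span.start_char, span.end_char)  # character offsets into doc.text
sliced_text = doc.text[span.start_char:span.end_char]
text = span.text
text_with_ws = span.text_with_ws                # keeps the space after "York"
```

Slicing the document text with the character offsets gives back the span's text, and `text_with_ws` differs from it only by the trailing whitespace of the last token.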