---
title: SpanGroup
tag: class
source: spacy/tokens/span_group.pyx
new: 3
---

A group of arbitrary, potentially overlapping [`Span`](/api/span) objects that
all belong to the same [`Doc`](/api/doc) object. The group can be named, and you
can attach additional attributes to it. Span groups are generally accessed via
the [`Doc.spans`](/api/doc#spans) attribute, which will convert lists of spans
into a `SpanGroup` object for you automatically on assignment. `SpanGroup`
objects behave similarly to `list`s, so you can append `Span` objects to them
or access a member at a given index.

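As a minimal sketch of the list-like behavior described above, the snippet
below assumes a blank English pipeline created with `spacy.blank("en")` (chosen
here only so that no trained model is needed) and that the text is split into
four whitespace-separated tokens.

```python
import spacy
from spacy.tokens import SpanGroup

# Assumption: a blank English pipeline is enough, since only the tokenizer is needed
nlp = spacy.blank("en")
doc = nlp("Their goi ng home")

# Assigning a list of spans converts it into a SpanGroup
doc.spans["errors"] = [doc[0:1]]
group = doc.spans["errors"]
assert isinstance(group, SpanGroup)

# Append another span and read members by index, as with a list
group.append(doc[1:3])
assert len(group) == 2
assert group[0].text == "Their"
assert group[1].text == "goi ng"
```
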
## SpanGroup.\_\_init\_\_ {#init tag="method"}

Create a `SpanGroup`.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> spans = [doc[0:1], doc[2:4]]
>
> # Construction 1
> from spacy.tokens import SpanGroup
>
> group = SpanGroup(doc, name="errors", spans=spans, attrs={"annotator": "matt"})
> doc.spans["errors"] = group
>
> # Construction 2
> doc.spans["errors"] = spans
> assert isinstance(doc.spans["errors"], SpanGroup)
> ```

| Name | Description |
| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `doc` | The document the span group belongs to. ~~Doc~~ |
| _keyword-only_ | |
| `name` | The name of the span group. If the span group is created automatically on assignment to `doc.spans`, the key name is used. Defaults to `""`. ~~str~~ |
| `attrs` | Optional JSON-serializable attributes to attach to the span group. ~~Dict[str, Any]~~ |
| `spans` | The spans to add to the span group. ~~Iterable[Span]~~ |

## SpanGroup.doc {#doc tag="property"}

The [`Doc`](/api/doc) object the span group refers to.

<Infobox title="SpanGroup and Doc lifecycle" variant="warning">

When a `Doc` object is garbage collected, any related `SpanGroup` object stops
being functional, because span groups only use a `weakref` to refer to their
document. An error is raised, as the internal `doc` object will be `None`. To
avoid this, make sure that the original `Doc` objects are still available in
the scope of your function.

</Infobox>

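The lifecycle issue can be illustrated with the standard library alone. The
sketch below does not use spaCy: `ToyDoc` and `ToyGroup` are hypothetical
stand-ins, and the error type and message are assumptions made for
illustration. It only shows how an object that holds a `weakref` loses access
to its referent once nothing else keeps that referent alive.

```python
import gc
import weakref


class ToyDoc:
    """Hypothetical stand-in for a document; not spaCy's Doc."""

    def __init__(self, text):
        self.text = text


class ToyGroup:
    """Hypothetical container that only keeps a weak reference to its document."""

    def __init__(self, doc):
        self._doc_ref = weakref.ref(doc)

    @property
    def doc(self):
        doc = self._doc_ref()  # resolves to None once the referent is collected
        if doc is None:
            raise RuntimeError("the document has been garbage collected")
        return doc


def make_group_losing_doc():
    doc = ToyDoc("Their goi ng home")
    return ToyGroup(doc)  # the only strong reference to doc ends with this call


def make_group_keeping_doc():
    doc = ToyDoc("Their goi ng home")
    return doc, ToyGroup(doc)  # returning doc keeps a strong reference alive


orphan = make_group_losing_doc()
gc.collect()  # CPython frees the document on return; this covers other runtimes
try:
    orphan.doc
    lost = False
except RuntimeError:
    lost = True
assert lost

kept_doc, group = make_group_keeping_doc()
assert group.doc is kept_doc
```

The second helper shows the usual remedy: return or store the document
together with the group so that a strong reference to it survives.
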
> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[2:4]]
> assert doc.spans["errors"].doc == doc
> ```

| Name | Description |
| ----------- | ------------------------------- |
| **RETURNS** | The reference document. ~~Doc~~ |

## SpanGroup.has_overlap {#has_overlap tag="property"}

Check whether the span group contains overlapping spans.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[2:4]]
> assert not doc.spans["errors"].has_overlap
> # doc[1:3] shares the token at index 2 with doc[2:4]
> doc.spans["errors"].append(doc[1:3])
> assert doc.spans["errors"].has_overlap
> ```

| Name | Description |
| ----------- | -------------------------------------------------- |
| **RETURNS** | Whether the span group contains overlaps. ~~bool~~ |

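To make the notion of overlap concrete, here is a plain-Python sketch that
works on `(start, end)` token offsets rather than `Span` objects. The function
name `offsets_overlap` is hypothetical, and this is not presented as spaCy's
implementation. It assumes half-open ranges, so ranges that merely touch do not
count as overlapping.

```python
def offsets_overlap(offsets):
    """Return True if any two half-open (start, end) ranges share a position."""
    ordered = sorted(offsets)
    # After sorting by start, any overlap shows up between two neighbours
    for (_, prev_end), (start, _) in zip(ordered, ordered[1:]):
        if start < prev_end:
            return True
    return False


# (0, 1) and (2, 4) are disjoint
assert not offsets_overlap([(0, 1), (2, 4)])
# (1, 2) only touches its neighbours, so there is still no overlap
assert not offsets_overlap([(0, 1), (2, 4), (1, 2)])
# (1, 3) shares position 2 with (2, 4)
assert offsets_overlap([(0, 1), (2, 4), (1, 3)])
```
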
## SpanGroup.\_\_len\_\_ {#len tag="method"}

Get the number of spans in the group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[2:4]]
> assert len(doc.spans["errors"]) == 2
> ```

| Name | Description |
| ----------- | ----------------------------------------- |
| **RETURNS** | The number of spans in the group. ~~int~~ |

## SpanGroup.\_\_getitem\_\_ {#getitem tag="method"}

Get a span from the group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> span = doc.spans["errors"][1]
> assert span.text == "goi ng"
> ```

| Name | Description |
| ----------- | ------------------------------------- |
| `i` | The item index. ~~int~~ |
| **RETURNS** | The span at the given index. ~~Span~~ |

## SpanGroup.append {#append tag="method"}

Add a [`Span`](/api/span) object to the group. The span must refer to the same
[`Doc`](/api/doc) object as the span group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1]]
> doc.spans["errors"].append(doc[2:4])
> assert len(doc.spans["errors"]) == 2
> ```

| Name | Description |
| ------ | ---------------------------- |
| `span` | The span to append. ~~Span~~ |

## SpanGroup.extend {#extend tag="method"}

Add multiple [`Span`](/api/span) objects to the group. All spans must refer to
the same [`Doc`](/api/doc) object as the span group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = []
> doc.spans["errors"].extend([doc[2:4], doc[0:1]])
> assert len(doc.spans["errors"]) == 2
> ```

| Name | Description |
| ------- | ------------------------------------ |
| `spans` | The spans to add. ~~Iterable[Span]~~ |

## SpanGroup.to_bytes {#to_bytes tag="method"}

Serialize the span group to a bytestring.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[2:4]]
> group_bytes = doc.spans["errors"].to_bytes()
> ```

| Name | Description |
| ----------- | ------------------------------------- |
| **RETURNS** | The serialized `SpanGroup`. ~~bytes~~ |

## SpanGroup.from_bytes {#from_bytes tag="method"}

Load the span group from a bytestring. Modifies the object in place and returns
it.

> #### Example
>
> ```python
> from spacy.tokens import SpanGroup
>
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[2:4]]
> group_bytes = doc.spans["errors"].to_bytes()
> # The constructor requires the Doc the group belongs to
> new_group = SpanGroup(doc)
> new_group.from_bytes(group_bytes)
> ```

| Name | Description |
| ------------ | ------------------------------------- |
| `bytes_data` | The data to load from. ~~bytes~~ |
| **RETURNS** | The `SpanGroup` object. ~~SpanGroup~~ |