---
title: SpanGroup
tag: class
source: spacy/tokens/span_group.pyx
new: 3
---

A group of arbitrary, potentially overlapping [`Span`](/api/span) objects that
all belong to the same [`Doc`](/api/doc) object. The group can be named, and you
can attach additional attributes to it. Span groups are generally accessed via
the [`Doc.spans`](/api/doc#spans) attribute, which automatically converts lists
of spans into a `SpanGroup` object on assignment. `SpanGroup` objects behave
similarly to `list`s, so you can append `Span` objects to them or access a
member at a given index.

## SpanGroup.\_\_init\_\_ {#init tag="method"}

Create a `SpanGroup`.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> spans = [doc[0:1], doc[1:3]]
>
> # Construction 1
> from spacy.tokens import SpanGroup
>
> group = SpanGroup(doc, name="errors", spans=spans, attrs={"annotator": "matt"})
> doc.spans["errors"] = group
>
> # Construction 2
> doc.spans["errors"] = spans
> assert isinstance(doc.spans["errors"], SpanGroup)
> ```

| Name           | Description                                                                                                                                          |
| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `doc`          | The document the span group belongs to. ~~Doc~~                                                                                                      |
| _keyword-only_ |                                                                                                                                                      |
| `name`         | The name of the span group. If the span group is created automatically on assignment to `doc.spans`, the key name is used. Defaults to `""`. ~~str~~ |
| `attrs`        | Optional JSON-serializable attributes to attach to the span group. ~~Dict[str, Any]~~                                                                |
| `spans`        | The spans to add to the span group. ~~Iterable[Span]~~                                                                                               |

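As a rough sketch of how the `name` and `attrs` arguments behave, the snippet below builds one group explicitly and one via assignment to `doc.spans`. It assumes a blank English pipeline (`spacy.blank("en")`) so that no trained model is needed, and the key `"typos"` is made up for illustration.

```python
import spacy
from spacy.tokens import SpanGroup

# Assumption: a blank English pipeline is enough here, since only the
# tokenizer is needed to create spans.
nlp = spacy.blank("en")
doc = nlp("Their goi ng home")
spans = [doc[0:1], doc[1:3]]

# Explicit construction: name and attrs are passed as keyword arguments.
group = SpanGroup(doc, name="errors", spans=spans, attrs={"annotator": "matt"})
doc.spans["errors"] = group
assert doc.spans["errors"].name == "errors"
assert doc.spans["errors"].attrs["annotator"] == "matt"

# Assigning a plain list: the key is used as the group name.
doc.spans["typos"] = spans  # "typos" is a hypothetical key
assert doc.spans["typos"].name == "typos"
```

The explicit form is mainly useful when you want to attach `attrs`; plain assignment is shorter when the key name alone is enough.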
## SpanGroup.doc {#doc tag="property"}

The [`Doc`](/api/doc) object the span group is referring to.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> assert doc.spans["errors"].doc == doc
> ```

| Name        | Description                     |
| ----------- | ------------------------------- |
| **RETURNS** | The reference document. ~~Doc~~ |

## SpanGroup.has_overlap {#has_overlap tag="property"}

Check whether the span group contains overlapping spans.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> assert not doc.spans["errors"].has_overlap
> doc.spans["errors"].append(doc[2:4])
> assert doc.spans["errors"].has_overlap
> ```

| Name        | Description                                        |
| ----------- | -------------------------------------------------- |
| **RETURNS** | Whether the span group contains overlaps. ~~bool~~ |

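To make the notion of overlap concrete, here is a minimal pure-Python sketch that treats each span as a half-open `(start, end)` token range. The helper `ranges_overlap` is hypothetical and is not spaCy's implementation; it only illustrates the idea that two spans overlap when they share at least one token.

```python
from typing import List, Optional, Tuple


def ranges_overlap(ranges: List[Tuple[int, int]]) -> bool:
    """Return True if any two half-open (start, end) token ranges overlap."""
    max_end: Optional[int] = None
    # After sorting by start, a range overlaps an earlier one if it starts
    # before the furthest end seen so far.
    for start, end in sorted(ranges):
        if max_end is not None and start < max_end:
            return True
        max_end = end if max_end is None else max(max_end, end)
    return False


# Tokens: "Their" (0), "goi" (1), "ng" (2), "home" (3)
assert not ranges_overlap([(0, 1), (1, 3)])       # adjacent, no shared token
assert ranges_overlap([(0, 1), (1, 3), (2, 4)])   # (1, 3) and (2, 4) share token 2
```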
## SpanGroup.\_\_len\_\_ {#len tag="method"}

Get the number of spans in the group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> assert len(doc.spans["errors"]) == 2
> ```

| Name        | Description                               |
| ----------- | ----------------------------------------- |
| **RETURNS** | The number of spans in the group. ~~int~~ |

## SpanGroup.\_\_getitem\_\_ {#getitem tag="method"}

Get a span from the group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> span = doc.spans["errors"][1]
> assert span.text == "goi ng"
> ```

| Name        | Description                           |
| ----------- | ------------------------------------- |
| `i`         | The item index. ~~int~~               |
| **RETURNS** | The span at the given index. ~~Span~~ |

## SpanGroup.append {#append tag="method"}

Add a [`Span`](/api/span) object to the group. The span must refer to the same
[`Doc`](/api/doc) object as the span group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1]]
> doc.spans["errors"].append(doc[1:3])
> assert len(doc.spans["errors"]) == 2
> ```

| Name   | Description                  |
| ------ | ---------------------------- |
| `span` | The span to append. ~~Span~~ |

## SpanGroup.extend {#extend tag="method"}

Add multiple [`Span`](/api/span) objects to the group. All spans must refer to
the same [`Doc`](/api/doc) object as the span group.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = []
> doc.spans["errors"].extend([doc[1:3], doc[0:1]])
> assert len(doc.spans["errors"]) == 2
> ```

| Name    | Description                          |
| ------- | ------------------------------------ |
| `spans` | The spans to add. ~~Iterable[Span]~~ |

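Because every span must refer to the same `Doc` as the group, one possible way to guard against mixing documents is to filter candidates before extending. The following is only a sketch under that assumption: it uses a blank English pipeline, and the names `other_doc`, `candidates` and `same_doc_spans` are illustrative.

```python
import spacy

# Assumption: a blank English pipeline, so no trained model is required.
nlp = spacy.blank("en")
doc = nlp("Their goi ng home")
other_doc = nlp("Their goi ng home")  # same text, but a different Doc object

doc.spans["errors"] = []
group = doc.spans["errors"]

# Candidate spans, one of which belongs to a different document.
candidates = [doc[1:3], other_doc[0:1], doc[0:1]]

# Keep only spans that refer to the same Doc object as the group.
same_doc_spans = [span for span in candidates if span.doc is group.doc]
group.extend(same_doc_spans)
assert len(group) == 2
```

The check uses identity (`is`) rather than equality, since two documents with identical text are still separate `Doc` objects.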
## SpanGroup.to_bytes {#to_bytes tag="method"}

Serialize the span group to a bytestring.

> #### Example
>
> ```python
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> group_bytes = doc.spans["errors"].to_bytes()
> ```

| Name        | Description                           |
| ----------- | ------------------------------------- |
| **RETURNS** | The serialized `SpanGroup`. ~~bytes~~ |

## SpanGroup.from_bytes {#from_bytes tag="method"}

Load the span group from a bytestring. Modifies the object in place and returns
it.

> #### Example
>
> ```python
> from spacy.tokens import SpanGroup
>
> doc = nlp("Their goi ng home")
> doc.spans["errors"] = [doc[0:1], doc[1:3]]
> group_bytes = doc.spans["errors"].to_bytes()
> # The constructor requires the doc the group belongs to
> new_group = SpanGroup(doc)
> new_group.from_bytes(group_bytes)
> ```

| Name         | Description                           |
| ------------ | ------------------------------------- |
| `bytes_data` | The data to load from. ~~bytes~~      |
| **RETURNS**  | The `SpanGroup` object. ~~SpanGroup~~ |
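
A round trip through `to_bytes` and `from_bytes` might look like the sketch below. It assumes a blank English pipeline and attaches the new group to the same `doc` the spans came from, on the assumption that the serialized data stores span offsets rather than the document text.

```python
import spacy
from spacy.tokens import SpanGroup

# Assumption: a blank English pipeline, so no trained model is required.
nlp = spacy.blank("en")
doc = nlp("Their goi ng home")
doc.spans["errors"] = [doc[0:1], doc[1:3]]
group_bytes = doc.spans["errors"].to_bytes()

# Assumption: the new group is created for the same doc, so the loaded
# offsets resolve to the original tokens.
new_group = SpanGroup(doc)
new_group.from_bytes(group_bytes)

assert new_group.name == "errors"
texts = [new_group[i].text for i in range(len(new_group))]
assert texts == ["Their", "goi ng"]
```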