Sofie Van Landeghem 2021-05-27 10:48:59 +02:00 committed by svlandeg
parent ee62344970
commit 4b81f58eda
2 changed files with 6 additions and 4 deletions


@ -71,6 +71,8 @@ def offsets_to_biluo_tags(
entities (iterable): A sequence of `(start, end, label)` triples. `start` entities (iterable): A sequence of `(start, end, label)` triples. `start`
and `end` should be character-offset integers denoting the slice into and `end` should be character-offset integers denoting the slice into
the original string. the original string.
missing (str): The label used for missing values, e.g. if tokenization
doesn't align with the entity offsets. Defaults to "O".
RETURNS (list): A list of unicode strings, describing the tags. Each tag RETURNS (list): A list of unicode strings, describing the tags. Each tag
string will be of the form either "", "O" or "{action}-{label}", where string will be of the form either "", "O" or "{action}-{label}", where
action is one of "B", "I", "L", "U". The missing label is used where the action is one of "B", "I", "L", "U". The missing label is used where the
@ -150,7 +152,7 @@ def biluo_tags_to_spans(doc: Doc, tags: Iterable[str]) -> List[Span]:
to overwrite the doc.ents. to overwrite the doc.ents.
doc (Doc): The document that the BILUO tags refer to. doc (Doc): The document that the BILUO tags refer to.
entities (iterable): A sequence of BILUO tags with each tag describing one tags (iterable): A sequence of BILUO tags with each tag describing one
token. Each tag string will be of the form of either "", "O" or token. Each tag string will be of the form of either "", "O" or
"{action}-{label}", where action is one of "B", "I", "L", "U". "{action}-{label}", where action is one of "B", "I", "L", "U".
RETURNS (list): A sequence of Span objects. Each token with a missing IOB RETURNS (list): A sequence of Span objects. Each token with a missing IOB
@ -170,7 +172,7 @@ def biluo_tags_to_offsets(
"""Encode per-token tags following the BILUO scheme into entity offsets. """Encode per-token tags following the BILUO scheme into entity offsets.
doc (Doc): The document that the BILUO tags refer to. doc (Doc): The document that the BILUO tags refer to.
entities (iterable): A sequence of BILUO tags with each tag describing one tags (iterable): A sequence of BILUO tags with each tag describing one
token. Each tag string will be of the form of either "", "O" or token. Each tag string will be of the form of either "", "O" or
"{action}-{label}", where action is one of "B", "I", "L", "U". "{action}-{label}", where action is one of "B", "I", "L", "U".
RETURNS (list): A sequence of `(start, end, label)` triples. `start` and RETURNS (list): A sequence of `(start, end, label)` triples. `start` and
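
For readers unfamiliar with the tag format these docstrings describe, the following is a minimal pure-Python sketch of decoding per-token BILUO tags into token-index spans. It is not spaCy's implementation: the helper name `biluo_to_token_spans` is hypothetical, it returns token indices rather than character offsets, and it assumes only the tag forms named in the docstrings (`""`, `"O"`, or `"{action}-{label}"`).

```python
from typing import Iterable, List, Tuple


def biluo_to_token_spans(tags: Iterable[str]) -> List[Tuple[int, int, str]]:
    """Hypothetical helper (not part of spaCy): decode BILUO tags into
    (start_token, end_token, label) triples, with end_token exclusive."""
    spans: List[Tuple[int, int, str]] = []
    start = None  # token index where the currently open entity began
    label = None  # label of the currently open entity
    for i, tag in enumerate(tags):
        if tag in ("", "O"):
            # Outside any entity (or no annotation): drop any open entity.
            start, label = None, None
            continue
        # Split on the first hyphen only, so labels may contain hyphens.
        action, _, tag_label = tag.partition("-")
        if action == "U":
            # Unit: a single-token entity.
            spans.append((i, i + 1, tag_label))
            start, label = None, None
        elif action == "B":
            # Begin: open a multi-token entity.
            start, label = i, tag_label
        elif action == "I":
            # Inside: must continue an open entity with the same label.
            if start is None or tag_label != label:
                raise ValueError(f"Unexpected tag {tag!r} at token {i}")
        elif action == "L":
            # Last: close the open entity.
            if start is None or tag_label != label:
                raise ValueError(f"Unexpected tag {tag!r} at token {i}")
            spans.append((start, i + 1, tag_label))
            start, label = None, None
        else:
            raise ValueError(f"Unknown tag {tag!r} at token {i}")
    return spans


example_tags = ["O", "B-ORG", "L-ORG", "O", "U-GPE"]
example_spans = biluo_to_token_spans(example_tags)
print(example_spans)
```

In this sketch an entity that is opened with `B` but never closed with `L` is silently dropped; a stricter variant could raise an error instead.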


@ -879,7 +879,7 @@ This method was previously available as `spacy.gold.offsets_from_biluo_tags`.
| Name | Description | | Name | Description |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `doc` | The document that the BILUO tags refer to. ~~Doc~~ | | `doc` | The document that the BILUO tags refer to. ~~Doc~~ |
| `entities` | A sequence of [BILUO](/usage/linguistic-features#accessing-ner) tags with each tag describing one token. Each tag string will be of the form of either `""`, `"O"` or `"{action}-{label}"`, where action is one of `"B"`, `"I"`, `"L"`, `"U"`. ~~List[str]~~ | | `tags` | A sequence of [BILUO](/usage/linguistic-features#accessing-ner) tags with each tag describing one token. Each tag string will be of the form of either `""`, `"O"` or `"{action}-{label}"`, where action is one of `"B"`, `"I"`, `"L"`, `"U"`. ~~List[str]~~ |
| **RETURNS** | A sequence of `(start, end, label)` triples. `start` and `end` will be character-offset integers denoting the slice into the original string. ~~List[Tuple[int, int, str]]~~ | | **RETURNS** | A sequence of `(start, end, label)` triples. `start` and `end` will be character-offset integers denoting the slice into the original string. ~~List[Tuple[int, int, str]]~~ |
### training.biluo_tags_to_spans {#biluo_tags_to_spans tag="function" new="2.1"} ### training.biluo_tags_to_spans {#biluo_tags_to_spans tag="function" new="2.1"}
@ -908,7 +908,7 @@ This method was previously available as `spacy.gold.spans_from_biluo_tags`.
| Name | Description | | Name | Description |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `doc` | The document that the BILUO tags refer to. ~~Doc~~ | | `doc` | The document that the BILUO tags refer to. ~~Doc~~ |
| `entities` | A sequence of [BILUO](/usage/linguistic-features#accessing-ner) tags with each tag describing one token. Each tag string will be of the form of either `""`, `"O"` or `"{action}-{label}"`, where action is one of `"B"`, `"I"`, `"L"`, `"U"`. ~~List[str]~~ | | `tags` | A sequence of [BILUO](/usage/linguistic-features#accessing-ner) tags with each tag describing one token. Each tag string will be of the form of either `""`, `"O"` or `"{action}-{label}"`, where action is one of `"B"`, `"I"`, `"L"`, `"U"`. ~~List[str]~~ |
| **RETURNS** | A sequence of `Span` objects with added entity labels. ~~List[Span]~~ | | **RETURNS** | A sequence of `Span` objects with added entity labels. ~~List[Span]~~ |
## Utility functions {#util source="spacy/util.py"} ## Utility functions {#util source="spacy/util.py"}