mirror of
https://github.com/explosion/spaCy.git
synced 2025-04-21 01:21:58 +03:00
add spancat.v3
parent
c13f9ec933
commit
e72eacd01d
|
@ -189,7 +189,7 @@ means that the task will always perform few-shot prompting under the hood.
|
|||
| `case_sensitive_matching` | Whether matching is case-sensitive. Defaults to `False` (case-insensitive search). ~~bool~~ |
|
||||
|
||||
Note that the `single_match` parameter, used in v1 and v2, is not supported
|
||||
anymore, as the CoT parsing takes care of this automatically.
|
||||
anymore, as the CoT parsing algorithm takes care of this automatically.
|
||||
|
||||
#### spacy.NER.v2 {id="ner-v2"}
|
||||
|
||||
|
@ -312,6 +312,37 @@ path = "ner_examples.yml"
|
|||
|
||||
The SpanCat task identifies potentially overlapping entities in text.
|
||||
|
||||
#### spacy.SpanCat.v3 {id="spancat-v3"}
|
||||
|
||||
The built-in SpanCat v3 task is a simple adaptation of the NER v3 task to
|
||||
support overlapping entities and store its annotations in `doc.spans`.
|
||||
|
||||
> #### Example config
|
||||
>
|
||||
> ```ini
|
||||
> [components.llm.task]
|
||||
> @llm_tasks = "spacy.SpanCat.v3"
|
||||
> labels = ["PERSON", "ORGANISATION", "LOCATION"]
|
||||
> examples = null
|
||||
> ```
|
||||
|
||||
| Argument | Description |
|
||||
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `labels` | List of labels, or a string containing a comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||
| `label_definitions` | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||
| `template` | Custom prompt template to send to LLM model. Defaults to [`spancat.v3.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/spancat.v3.jinja). ~~str~~ |
|
||||
| `description` (NEW) | A description of what to recognize or not recognize as entities. ~~str~~ |
|
||||
| `spans_key` | Key of the `Doc.spans` dict to save the spans under. Defaults to `"sc"`. ~~str~~ |
|
||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
|
||||
| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
|
||||
| `case_sensitive_matching` | Whether matching is case-sensitive. Defaults to `False` (case-insensitive search). ~~bool~~ |
|
||||
|
||||
Note that the `single_match` parameter, used in v1 and v2, is not supported
|
||||
anymore, as the CoT parsing algorithm takes care of this automatically.
|
||||
|
||||
# TODO: check_label_consistency ?
|
||||
|
||||
#### spacy.SpanCat.v2 {id="spancat-v2"}
|
||||
|
||||
The built-in SpanCat v2 task is a simple adaptation of the NER v2 task to
|
||||
|
@ -329,8 +360,8 @@ support overlapping entities and store its annotations in `doc.spans`.
|
|||
| Argument | Description |
|
||||
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `labels` | List of labels, or a string containing a comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||
| `template` (NEW) | Custom prompt template to send to LLM model. Defaults to [`spancat.v2.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/spancat.v2.jinja). ~~str~~ |
|
||||
| `label_definitions` (NEW) | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||
| `template` (NEW) | Custom prompt template to send to LLM model. Defaults to [`spancat.v2.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/spancat.v2.jinja). ~~str~~ |
|
||||
| `spans_key` | Key of the `Doc.spans` dict to save the spans under. Defaults to `"sc"`. ~~str~~ |
|
||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
|
||||
|
|
|
@ -369,12 +369,13 @@ evaluate the component.
|
|||
| [`spacy.NER.v3`](/api/large-language-models#ner-v3) | Implements Chain-of-Thought reasoning for NER extraction - obtains higher accuracy than v1 or v2. |
|
||||
| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | Builds on v1 and additionally supports defining the provided labels with explicit descriptions. |
|
||||
| [`spacy.NER.v1`](/api/large-language-models#ner-v1) | The original version of the built-in NER task supports both zero-shot and few-shot prompting. |
|
||||
| [`spacy.SpanCat.v3`](/api/large-language-models#spancat-v3) | Adaptation of the v3 NER task to support overlapping entities and store its annotations in `doc.spans`. |
|
||||
| [`spacy.SpanCat.v2`](/api/large-language-models#spancat-v2) | Adaptation of the v2 NER task to support overlapping entities and store its annotations in `doc.spans`. |
|
||||
| [`spacy.SpanCat.v1`](/api/large-language-models#spancat-v1) | Adaptation of the v1 NER task to support overlapping entities and store its annotations in `doc.spans`. |
|
||||
| [`spacy.REL.v1`](/api/large-language-models#rel-v1) | Relation Extraction task supporting both zero-shot and few-shot prompting. |
|
||||
| [`spacy.TextCat.v3`](/api/large-language-models#textcat-v3) | Version 3 builds on v2 and allows setting definitions of labels. |
|
||||
| [`spacy.TextCat.v2`](/api/large-language-models#textcat-v2) | Version 2 builds on v1 and includes an improved prompt template. |
|
||||
| [`spacy.TextCat.v1`](/api/large-language-models#textcat-v1) | Version 1 of the built-in TextCat task supports both zero-shot and few-shot prompting. |
|
||||
| [`spacy.REL.v1`](/api/large-language-models#rel-v1) | Relation Extraction task supporting both zero-shot and few-shot prompting. |
|
||||
| [`spacy.Lemma.v1`](/api/large-language-models#lemma-v1) | Lemmatizes the provided text and updates the `lemma_` attribute of the tokens accordingly. |
|
||||
| [`spacy.Sentiment.v1`](/api/large-language-models#sentiment-v1) | Performs sentiment analysis on provided texts. |
|
||||
| [`spacy.NoOp.v1`](/api/large-language-models#noop-v1) | This task is only useful for testing - it tells the LLM to do nothing, and does not set any fields on the `docs`. |
|
||||
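
Not part of the diff: the `alignment_mode` and `case_sensitive_matching` rows in the `spacy.SpanCat.v3` table are easier to reason about with a concrete case. The following is a minimal, stdlib-only sketch. It assumes a whitespace tokenizer as a stand-in for spaCy's tokenizer, and it assumes the three modes behave like the modes of the same names in spaCy's `Doc.char_span`. The helpers `whitespace_tokens`, `align` and `find_spans` are hypothetical names made up for illustration; they are not spacy-llm functions, and this is not how the task is actually implemented.

```python
import re
from typing import List, Optional, Tuple

Token = Tuple[int, int]  # (start_char, end_char) of one token, end exclusive


def whitespace_tokens(text: str) -> List[Token]:
    """Hypothetical stand-in tokenizer: one token per run of non-whitespace."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def align(
    tokens: List[Token], start: int, end: int, mode: str = "contract"
) -> Optional[Tuple[int, int]]:
    """Map a character span to a token span (first_token, last_token + 1).

    Returns None when no token span can be produced for the given mode.
    """
    if mode == "strict":
        # Both character boundaries must coincide with token boundaries.
        starts = {s: i for i, (s, _) in enumerate(tokens)}
        ends = {e: i + 1 for i, (_, e) in enumerate(tokens)}
        if start in starts and end in ends and starts[start] < ends[end]:
            return starts[start], ends[end]
        return None
    if mode == "contract":
        # Keep only tokens that lie completely inside the character span.
        hits = [i for i, (s, e) in enumerate(tokens) if s >= start and e <= end]
    elif mode == "expand":
        # Keep every token that overlaps the character span at all.
        hits = [i for i, (s, e) in enumerate(tokens) if s < end and e > start]
    else:
        raise ValueError(f"unknown alignment mode: {mode!r}")
    if not hits:
        return None
    return hits[0], hits[-1] + 1


def find_spans(
    text: str, phrase: str, mode: str = "contract", case_sensitive: bool = False
) -> List[Tuple[int, int]]:
    """Find every occurrence of `phrase` in `text` and align it to tokens."""
    flags = 0 if case_sensitive else re.IGNORECASE
    tokens = whitespace_tokens(text)
    spans = []
    for match in re.finditer(re.escape(phrase), text, flags):
        aligned = align(tokens, match.start(), match.end(), mode)
        if aligned is not None:
            spans.append(aligned)
    return spans


if __name__ == "__main__":
    text = "The New Yorker's office is in new york."
    for mode in ("strict", "contract", "expand"):
        print(mode, find_spans(text, "New York", mode=mode))
```

In this toy text, the phrase `New York` ends in the middle of the tokens `Yorker's` and `york.`, so `"strict"` discards both matches, `"contract"` shrinks each match to its first token, and `"expand"` grows each match to cover both tokens. Passing `case_sensitive=True` leaves only the capitalised occurrence.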
|
|