mirror of
https://github.com/explosion/spaCy.git
synced 2025-06-29 17:33:10 +03:00
v3 overview of parameters
This commit is contained in:
parent
e5f561565b
commit
c13f9ec933
|
@ -108,8 +108,8 @@ prompting.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [summarization.v1.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/summarization.v1.jinja). ~~str~~ |
|
| `template` | Custom prompt template to send to LLM model. Defaults to [summarization.v1.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/summarization.v1.jinja). ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
| `max_n_words` | Maximum number of words to be used in summary. Note that this should not expected to work exactly. Defaults to `None`. ~~Optional[int]~~ |
|
| `max_n_words` | Maximum number of words to be used in summary. Note that this should not expected to work exactly. Defaults to `None`. ~~Optional[int]~~ |
|
||||||
| `field` | Name of extension attribute to store summary in (i. e. the summary will be available in `doc._.{field}`). Defaults to `summary`. ~~str~~ |
|
| `field` | Name of extension attribute to store summary in (i. e. the summary will be available in `doc._.{field}`). Defaults to `summary`. ~~str~~ |
|
||||||
|
@ -157,6 +157,40 @@ path = "summarization_examples.yml"
|
||||||
|
|
||||||
The NER task identifies non-overlapping entities in text.
|
The NER task identifies non-overlapping entities in text.
|
||||||
|
|
||||||
|
#### spacy.NER.v3 {id="ner-v3"}
|
||||||
|
|
||||||
|
Version 3 is fundamentally different to v1 and v2, as it implements
|
||||||
|
Chain-of-Thought prompting, based on
|
||||||
|
[the PromptNER paper by Ashok and Lipton (2023)](https://arxiv.org/pdf/2305.15444.pdf).
|
||||||
|
From preliminary experiments, we've found this implementation to obtain
|
||||||
|
significant better accuracy.
|
||||||
|
|
||||||
|
> #### Example config
|
||||||
|
>
|
||||||
|
> ```ini
|
||||||
|
> [components.llm.task]
|
||||||
|
> @llm_tasks = "spacy.NER.v3"
|
||||||
|
> labels = ["PERSON", "ORGANISATION", "LOCATION"]
|
||||||
|
> ```
|
||||||
|
|
||||||
|
When no examples are [specified](/usage/large-language-models#few-shot-prompts),
|
||||||
|
the v3 implementation will use a dummy example in the prompt. Technically this
|
||||||
|
means that the task will always perform few-shot prompting under the hood.
|
||||||
|
|
||||||
|
| Argument | Description |
|
||||||
|
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||||
|
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||||
|
| `label_definitions` | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||||
|
| `template` | Custom prompt template to send to LLM model. Defaults to [ner.v3.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/ner.v3.jinja). ~~str~~ |
|
||||||
|
| `description` (NEW) | A description of what to recognize or not recognize as entities. ~~str~~ |
|
||||||
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
|
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
||||||
|
| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
|
||||||
|
| `case_sensitive_matching` | Whether to search without case sensitivity. Defaults to `False`. ~~bool~~ |
|
||||||
|
|
||||||
|
Note that the `single_match` parameter, used in v1 and v2, is not supported
|
||||||
|
anymore, as the CoT parsing takes care of this automatically.
|
||||||
|
|
||||||
#### spacy.NER.v2 {id="ner-v2"}
|
#### spacy.NER.v2 {id="ner-v2"}
|
||||||
|
|
||||||
This version supports explicitly defining the provided labels with custom
|
This version supports explicitly defining the provided labels with custom
|
||||||
|
@ -173,10 +207,10 @@ v1.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||||
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||||
| `template` (NEW) | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [ner.v2.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/ner.v2.jinja). ~~str~~ |
|
|
||||||
| `label_definitions` (NEW) | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
| `label_definitions` (NEW) | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||||
|
| `template` (NEW) | Custom prompt template to send to LLM model. Defaults to [ner.v2.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/ner.v2.jinja). ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
||||||
| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
|
| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
|
||||||
|
@ -293,9 +327,9 @@ support overlapping entities and store its annotations in `doc.spans`.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||||
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||||
| `template` (NEW) | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`spancat.v2.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/spancat.v2.jinja). ~~str~~ |
|
| `template` (NEW) | Custom prompt template to send to LLM model. Defaults to [`spancat.v2.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/spancat.v2.jinja). ~~str~~ |
|
||||||
| `label_definitions` (NEW) | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
| `label_definitions` (NEW) | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||||
| `spans_key` | Key of the `Doc.spans` dict to save the spans under. Defaults to `"sc"`. ~~str~~ |
|
| `spans_key` | Key of the `Doc.spans` dict to save the spans under. Defaults to `"sc"`. ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
|
@ -361,10 +395,10 @@ prompt.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||||
| `label_definitions` (NEW) | Dictionary of label definitions. Included in the prompt, if set. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
| `label_definitions` (NEW) | Dictionary of label definitions. Included in the prompt, if set. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||||
| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`textcat.v3.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/textcat.v3.jinja). ~~str~~ |
|
| `template` | Custom prompt template to send to LLM model. Defaults to [`textcat.v3.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/textcat.v3.jinja). ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
||||||
| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
|
| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
|
||||||
|
@ -388,9 +422,9 @@ V2 includes all v1 functionality, with an improved prompt template.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||||
| `template` (NEW) | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`textcat.v2.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/textcat.v2.jinja). ~~str~~ |
|
| `template` (NEW) | Custom prompt template to send to LLM model. Defaults to [`textcat.v2.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/textcat.v2.jinja). ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
|
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
|
||||||
| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
|
| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
|
||||||
|
@ -465,9 +499,9 @@ on an upstream NER component for entities extraction.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
|
||||||
| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`rel.v1.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/rel.v1.jinja). ~~str~~ |
|
| `template` | Custom prompt template to send to LLM model. Defaults to [`rel.v3.jinja`](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/rel.v1.jinja). ~~str~~ |
|
||||||
| `label_description` | Dictionary providing a description for each relation label. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
| `label_description` | Dictionary providing a description for each relation label. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
|
||||||
|
@ -514,8 +548,8 @@ This task supports both zero-shot and few-shot prompting.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [lemma.v1.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/lemma.v1.jinja). ~~str~~ |
|
| `template` | Custom prompt template to send to LLM model. Defaults to [lemma.v1.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/lemma.v1.jinja). ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
|
|
||||||
The task prompts the LLM to lemmatize the passed text and return the lemmatized
|
The task prompts the LLM to lemmatize the passed text and return the lemmatized
|
||||||
|
@ -591,8 +625,8 @@ This task supports both zero-shot and few-shot prompting.
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ---------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ---------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||||
| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [sentiment.v1.jinja](./spacy_llm/tasks/templates/sentiment.v1.jinja). ~~str~~ |
|
| `template` | Custom prompt template to send to LLM model. Defaults to [sentiment.v1.jinja](./spacy_llm/tasks/templates/sentiment.v1.jinja). ~~str~~ |
|
||||||
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
|
||||||
| `field` | Name of extension attribute to store summary in (i. e. the summary will be available in `doc._.{field}`). Defaults to `sentiment`. ~~str~~ |
|
| `field` | Name of extension attribute to store summary in (i. e. the summary will be available in `doc._.{field}`). Defaults to `sentiment`. ~~str~~ |
|
||||||
|
|
||||||
|
@ -648,7 +682,9 @@ implementations can have other signatures, like
|
||||||
|
|
||||||
### Models via REST API {id="models-rest"}
|
### Models via REST API {id="models-rest"}
|
||||||
|
|
||||||
These models all take the same parameters, but note that the `config` should contain provider-specific keys and values, as it will be passed onwards to the provider's API.
|
These models all take the same parameters, but note that the `config` should
|
||||||
|
contain provider-specific keys and values, as it will be passed onwards to the
|
||||||
|
provider's API.
|
||||||
|
|
||||||
| Argument | Description |
|
| Argument | Description |
|
||||||
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
|
|
Loading…
Reference in New Issue
Block a user