diff --git a/website/docs/api/large-language-models.mdx b/website/docs/api/large-language-models.mdx
index 7b85f6658..235c8003d 100644
--- a/website/docs/api/large-language-models.mdx
+++ b/website/docs/api/large-language-models.mdx
@@ -1,6 +1,12 @@
---
title: Large Language Models
teaser: Integrating LLMs into structured NLP pipelines
+menu:
+ - ['Config', 'config']
+ - ['Tasks', 'tasks']
+ - ['Models', 'models']
+ - ['Cache', 'cache']
+ - ['Various Functions', 'various-functions']
---
[The spacy-llm package](https://github.com/explosion/spacy-llm) integrates Large
@@ -14,23 +20,23 @@ required.
`spacy-llm` exposes a `llm` factory that accepts the following configuration
options:
-| Argument | Description |
-| ---------------- | --------------------------------------------------------------------------------------------------------- |
-| `task` | An LLMTask can generate prompts and parse LLM responses. See [docs](#tasks). ~~Optional[LLMTask]~~ |
-| `backend` | Callable querying a specific LLM API. See [docs](#backends). ~~Callable[[Iterable[Any]], Iterable[Any]]~~ |
-| `cache` | Cache to use for caching prompts and responses per doc (batch). See [docs](#cache). ~~Cache~~ |
-| `save_io` | Whether to save prompts/responses within `Doc.user_data["llm_io"]`. ~~bool~~ |
-| `validate_types` | Whether to check if signatures of configured backend and task are consistent. ~~bool~~ |
+| Argument | Description |
+| ---------------- | ------------------------------------------------------------------------------------------------------- |
+| `task` | An LLMTask can generate prompts and parse LLM responses. See [docs](#tasks). ~~Optional[LLMTask]~~ |
+| `model` | Callable querying a specific LLM API. See [docs](#models). ~~Callable[[Iterable[Any]], Iterable[Any]]~~ |
+| `cache` | Cache to use for caching prompts and responses per doc (batch). See [docs](#cache). ~~Cache~~ |
+| `save_io` | Whether to save prompts/responses within `Doc.user_data["llm_io"]`. ~~bool~~ |
+| `validate_types` | Whether to check if signatures of configured backend and task are consistent. ~~bool~~ |
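+
+These settings can also be provided directly in Python when adding the component
+to a pipeline. A minimal sketch (the task and model shown here are just examples,
+and a valid OpenAI API key is assumed to be set as described below):
+
+```python
+import spacy
+
+nlp = spacy.blank("en")
+# Configure the llm factory with a task and a model.
+nlp.add_pipe(
+    "llm",
+    config={
+        "task": {"@llm_tasks": "spacy.NER.v2", "labels": "PERSON,ORGANISATION,LOCATION"},
+        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
+    },
+)
+doc = nlp("Jack and Jill went up the hill in Kathmandu.")
+print(doc.ents)
+```
+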
An `llm` component is defined by two main settings:
- A [**task**](#tasks), defining the prompt to send to the LLM as well as the
functionality to parse the resulting response back into structured fields on
spaCy's [Doc](https://spacy.io/api/doc) objects.
-- A [**backend**](#backends) defining the model to use and how to connect to it.
- Note that `spacy-llm` supports both access to external APIs (such as OpenAI)
- as well as access to self-hosted open-source LLMs (such as using Dolly through
- Hugging Face).
+- A [**model**](#models) defining the model to use and how to connect to it. Note that
+ `spacy-llm` supports both access to external APIs (such as OpenAI) as well as
+ access to self-hosted open-source LLMs (such as using Dolly through Hugging
+ Face).
Moreover, `spacy-llm` exposes a customizable [**caching**](#cache) functionality
to avoid running the same document through an LLM service (be it local or
@@ -63,7 +69,44 @@ Moreover, the task may define an optional
iterable of `Example`s as input and return a score dictionary. If the `scorer`
method is defined, `spacy-llm` will call it to evaluate the component.
-#### function task.generate_prompts {id="task-generate-prompts"}
+#### Providing examples for few-shot prompts {id="few-shot-prompts"}
+
+All built-in tasks support few-shot prompts, i. e. including examples in a
+prompt. Examples can be supplied in two ways: (1) as a separate file containing
+only examples or (2) by initializing `llm` with a `get_examples()` callback
+(like any other spaCy pipeline component).
+
+##### (1) Few-shot example file
+
+A file containing examples for few-shot prompting can be configured like this:
+
+```ini
+[components.llm.task]
+@llm_tasks = "spacy.NER.v2"
+labels = PERSON,ORGANISATION,LOCATION
+[components.llm.task.examples]
+@misc = "spacy.FewShotReader.v1"
+path = "ner_examples.yml"
+```
+
+The supplied file has to conform to the format expected by the required task
+(see the task documentation further down).
+
+##### (2) Initializing the `llm` component with a `get_examples()` callback
+
+Alternatively, you can initialize your `nlp` pipeline by providing a
+`get_examples` callback for
+[`nlp.initialize`](https://spacy.io/api/language#initialize) and setting
+`n_prompt_examples` to a positive number to automatically fetch a few examples
+for few-shot learning. Set `n_prompt_examples` to `-1` to use all examples as
+part of the few-shot learning prompt.
+
+```ini
+[initialize.components.llm]
+n_prompt_examples = 3
+```
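+
+For illustration, a minimal sketch of initializing the component with a
+`get_examples` callback (the annotated example and labels used here are
+hypothetical):
+
+```python
+import spacy
+from spacy.training import Example
+
+nlp = spacy.blank("en")
+nlp.add_pipe(
+    "llm",
+    config={"task": {"@llm_tasks": "spacy.NER.v2", "labels": "PERSON,LOCATION"}},
+)
+
+# A few gold-annotated examples to sample few-shot prompt examples from.
+doc = nlp.make_doc("Jack lives in Kathmandu.")
+examples = [Example.from_dict(doc, {"entities": [(0, 4, "PERSON"), (14, 23, "LOCATION")]})]
+
+# With n_prompt_examples set in the [initialize] block (see above), spacy-llm
+# fetches examples for the few-shot prompt from this callback.
+nlp.initialize(lambda: examples)
+```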
+
+#### task.generate_prompts {id="task-generate-prompts"}
Takes a collection of documents, and returns a collection of "prompts", which
can be of type `Any`. Often, prompts are of type `str` - but this is not
@@ -74,7 +117,7 @@ enforced to allow for maximum flexibility in the framework.
| `docs` | The input documents. ~~Iterable[Doc]~~ |
| **RETURNS** | The generated prompts. ~~Iterable[Any]~~ |
-#### function task.parse_responses {id="task-parse-responses"}
+#### task.parse_responses {id="task-parse-responses"}
Takes a collection of LLM responses and the original documents, parses the
responses into structured information, and sets the annotations on the
@@ -83,7 +126,7 @@ way, including `Doc` fields like `ents`, `spans` or `cats`, or using custom
defined fields.
The `responses` are of type `Iterable[Any]`, though they will often be `str`
-objects. This depends on the return type of the [backend](#backends).
+objects. This depends on the return type of the [model](#models).
| Argument | Description |
| ----------- | ------------------------------------------ |
@@ -91,6 +134,67 @@ objects. This depends on the return type of the [backend](#backends).
| `responses` | The LLM responses. ~~Iterable[Any]~~ |
| **RETURNS** | The annotated documents. ~~Iterable[Doc]~~ |
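+
+As an illustration of this protocol, here is a minimal sketch of a custom task.
+Registration with the `llm_tasks` registry and error handling are omitted, and
+the prompt and parsing logic are purely illustrative:
+
+```python
+from typing import Any, Iterable
+
+from spacy.tokens import Doc
+
+
+class SimpleCategorizerTask:
+    """Toy task: ask the LLM for a single category and store it on the doc."""
+
+    def __init__(self, labels: str = "POSITIVE,NEGATIVE"):
+        self._labels = labels
+        if not Doc.has_extension("simple_cat"):
+            Doc.set_extension("simple_cat", default=None)
+
+    def generate_prompts(self, docs: Iterable[Doc]) -> Iterable[str]:
+        for doc in docs:
+            yield f"Classify the following text as one of {self._labels}:\n{doc.text}"
+
+    def parse_responses(
+        self, docs: Iterable[Doc], responses: Iterable[Any]
+    ) -> Iterable[Doc]:
+        for doc, response in zip(docs, responses):
+            doc._.simple_cat = str(response).strip()
+            yield doc
+```
+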
+#### spacy.Summarization.v1 {id="summarization-v1"}
+
+The `spacy.Summarization.v1` task supports both zero-shot and few-shot
+prompting.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.task]
+> @llm_tasks = "spacy.Summarization.v1"
+> examples = null
+> max_n_words = null
+> ```
+
+| Argument | Description |
+| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `template`    | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [summarization.jinja](./spacy_llm/tasks/templates/summarization.jinja). ~~str~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| `max_n_words` | Maximum number of words to be used in the summary. Note that this should not be expected to work exactly. Defaults to `None`. ~~Optional[int]~~ |
+| `field` | Name of extension attribute to store summary in (i. e. the summary will be available in `doc._.{field}`). Defaults to `summary`. ~~str~~ |
+
+The summarization task prompts the model for a concise summary of the provided
+text. It optionally allows you to limit the response to a certain number of
+words - note that this limit is only stated in the prompt, and the task doesn't
+perform a hard cut-off. It's hence possible that your summary exceeds
+`max_n_words`.
+
+To perform few-shot learning, you can write down a few examples in a separate
+file, and provide these to be injected into the prompt to the LLM. The default
+reader `spacy.FewShotReader.v1` supports `.yml`, `.yaml`, `.json` and `.jsonl`.
+
+```yaml
+- text: >
+ The United Nations, referred to informally as the UN, is an
+ intergovernmental organization whose stated purposes are to maintain
+ international peace and security, develop friendly relations among nations,
+ achieve international
+
+ cooperation, and serve as a centre for harmonizing the actions of nations.
+ It is the world's largest international organization. The UN is
+ headquartered on international territory in New York City, and the
+ organization has other offices in Geneva, Nairobi, Vienna, and The Hague,
+ where the International Court of Justice is headquartered.\n\n The UN was
+ established after World War II with the aim of preventing future world wars,
+ and succeeded the League of Nations, which was characterized as
+ ineffective.
+ summary:
+ 'The UN is an international organization that promotes global peace,
+ cooperation, and harmony. Established after WWII, its purpose is to prevent
+ future world wars.'
+```
+
+```ini
+[components.llm.task]
+@llm_tasks = "spacy.Summarization.v1"
+max_n_words = 20
+[components.llm.task.examples]
+@misc = "spacy.FewShotReader.v1"
+path = "summarization_examples.yml"
+```
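+
+Assuming the config above is saved as `config.cfg` (a hypothetical file name)
+and the required API key is set, a minimal usage sketch could look like this:
+
+```python
+from spacy_llm.util import assemble
+
+# Assemble the pipeline from the config shown above.
+nlp = assemble("config.cfg")
+doc = nlp("The United Nations is an intergovernmental organization ...")
+# The summary is stored in the configured extension attribute (default: "summary").
+print(doc._.summary)
+```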
+
#### spacy.NER.v2 {id="ner-v2"}
The built-in NER task supports both zero-shot and few-shot prompting. This
@@ -106,16 +210,16 @@ descriptions.
> examples = null
> ```
-| Argument | Description |
-| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
-| `template` | Custom prompt template to send to LLM backend. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [ner.v2.jinja](https://github.com/spacy-llm/spacy_llm/tasks/templates/ner.v2.jinja). ~~str~~ |
-| `label_definitions` | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
-| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
-| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
-| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
-| `case_sensitive_matching` | Whether to search without case sensitivity. Defaults to `False`. ~~bool~~ |
-| `single_match` | Whether to match an entity in the LLM's response only once (the first hit) or multiple times. Defaults to `False`. ~~bool~~ |
+| Argument | Description |
+| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
+| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [ner.v2.jinja](https://github.com/spacy-llm/spacy_llm/tasks/templates/ner.v2.jinja). ~~str~~ |
+| `label_definitions` | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
+| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
+| `case_sensitive_matching` | Whether to search without case sensitivity. Defaults to `False`. ~~bool~~ |
+| `single_match` | Whether to match an entity in the LLM's response only once (the first hit) or multiple times. Defaults to `False`. ~~bool~~ |
The NER task implementation doesn't currently ask the LLM for specific offsets,
but simply expects a list of strings that represent the entities in the document.
@@ -165,15 +269,14 @@ path = "ner_examples.yml"
```
> Label descriptions can also be used with explicit examples to give as much
-> info to the LLM backend as possible.
+> info to the LLM model as possible.
-If you don't have specific examples to provide to the LLM, you can write
-definitions for each label and provide them via the `label_definitions`
-argument. This lets you tell the LLM exactly what you're looking for rather than
-relying on the LLM to interpret its task given just the label name. Label
-descriptions are freeform so you can write whatever you want here, but through
-some experiments a brief description along with some examples and counter
-examples seems to work quite well.
+You can also write definitions for each label and provide them via the
+`label_definitions` argument. This lets you tell the LLM exactly what you're
+looking for rather than relying on the LLM to interpret its task given just the
+label name. Label descriptions are freeform so you can write whatever you want
+here, but through some experiments a brief description along with some examples
+and counter examples seems to work quite well.
```ini
[components.llm.task]
@@ -268,17 +371,17 @@ overlapping entities and store its annotations in `doc.spans`.
> examples = null
> ```
-| Argument | Description |
-| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
-| `template` | Custom prompt template to send to LLM backend. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`spancat.v2.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/spancat.v2.jinja). ~~str~~ |
-| `label_definitions` | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
-| `spans_key` | Key of the `Doc.spans` dict to save the spans under. Defaults to `"sc"`. ~~str~~ |
-| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
-| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
-| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
-| `case_sensitive_matching` | Whether to search without case sensitivity. Defaults to `False`. ~~bool~~ |
-| `single_match` | Whether to match an entity in the LLM's response only once (the first hit) or multiple times. Defaults to `False`. ~~bool~~ |
+| Argument | Description |
+| ------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
+| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`spancat.v2.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/spancat.v2.jinja). ~~str~~ |
+| `label_definitions` | Optional dict mapping a label to a description of that label. These descriptions are added to the prompt to help instruct the LLM on what to extract. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
+| `spans_key` | Key of the `Doc.spans` dict to save the spans under. Defaults to `"sc"`. ~~str~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, defaults to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
+| `alignment_mode` | Alignment mode in case the LLM returns entities that do not align with token boundaries. Options are `"strict"`, `"contract"` or `"expand"`. Defaults to `"contract"`. ~~str~~ |
+| `case_sensitive_matching` | Whether to search without case sensitivity. Defaults to `False`. ~~bool~~ |
+| `single_match` | Whether to match an entity in the LLM's response only once (the first hit) or multiple times. Defaults to `False`. ~~bool~~ |
Except for the `spans_key` parameter, the SpanCat task reuses the configuration
from the NER task. Refer to [its documentation](#ner-v2) for more insight.
@@ -335,7 +438,7 @@ definitions are included in the prompt.
| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
| `label_definitions` | Dictionary of label definitions. Included in the prompt, if set. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
| `template` | Custom prompt template to send to LLM backend. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`textcat.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/textcat.jinja). ~~str~~ |
-| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. Optional[Callable[[], Iterable[Any]]] |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
| `allow_none` | When set to `True`, allows the LLM to not return any of the given label. The resulting dict in `doc.cats` will have `0.0` scores for all labels. Defaults to `True`. ~~bool~~ |
@@ -385,15 +488,15 @@ prompting and includes an improved prompt template.
> examples = null
> ```
-| Argument | Description |
-| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
-| `template` | Custom prompt template to send to LLM backend. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`textcat.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/textcat.jinja). ~~str~~ |
-| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
-| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
-| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
-| `allow_none` | When set to `True`, allows the LLM to not return any of the given label. The resulting dict in `doc.cats` will have `0.0` scores for all labels. Defaults to `True`. ~~bool~~ |
-| `verbose` | If set to `True`, warnings will be generated when the LLM returns invalid responses. Defaults to `False`. ~~bool~~ |
+| Argument | Description |
+| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
+| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`textcat.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/textcat.jinja). ~~str~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. ~~Optional[Callable[[str], str]]~~ |
+| `exclusive_classes` | If set to `True`, only one label per document should be valid. If set to `False`, one document can have multiple labels. Defaults to `False`. ~~bool~~ |
+| `allow_none` | When set to `True`, allows the LLM to not return any of the given label. The resulting dict in `doc.cats` will have `0.0` scores for all labels. Defaults to `True`. ~~bool~~ |
+| `verbose` | If set to `True`, warnings will be generated when the LLM returns invalid responses. Defaults to `False`. ~~bool~~ |
To perform few-shot learning, you can write down a few examples in a separate
file, and provide these to be injected into the prompt to the LLM. The default
@@ -483,14 +586,14 @@ on an upstream NER component for entities extraction.
> labels = ["LivesIn", "Visits"]
> ```
-| Argument | Description |
-| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
-| `template` | Custom prompt template to send to LLM backend. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`rel.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/rel.jinja). ~~str~~ |
-| `label_description` | Dictionary providing a description for each relation label. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
-| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
-| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
-| `verbose` | If set to `True`, warnings will be generated when the LLM returns invalid responses. Defaults to `False`. ~~bool~~ |
+| Argument | Description |
+| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `labels` | List of labels or str of comma-separated list of labels. ~~Union[List[str], str]~~ |
+| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [`rel.jinja`](https://github.com/spacy-llm/spacy_llm/tasks/templates/rel.jinja). ~~str~~ |
+| `label_description` | Dictionary providing a description for each relation label. Defaults to `None`. ~~Optional[Dict[str, str]]~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| `normalizer` | Function that normalizes the labels as returned by the LLM. If `None`, falls back to `spacy.LowercaseNormalizer.v1`. Defaults to `None`. ~~Optional[Callable[[str], str]]~~ |
+| `verbose` | If set to `True`, warnings will be generated when the LLM returns invalid responses. Defaults to `False`. ~~bool~~ |
To perform few-shot learning, you can write down a few examples in a separate
file, and provide these to be injected into the prompt to the LLM. The default
@@ -501,10 +604,6 @@ reader `spacy.FewShotReader.v1` supports `.yml`, `.yaml`, `.json` and `.jsonl`.
{"text": "Michael travelled through South America by bike.", "ents": [{"start_char": 0, "end_char": 7, "label": "PERSON"}, {"start_char": 26, "end_char": 39, "label": "LOC"}], "relations": [{"dep": 0, "dest": 1, "relation": "Visits"}]}
```
-Note: the REL task relies on pre-extracted entities to make its prediction.
-Hence, you'll need to add a component that populates `doc.ents` with recognized
-spans to your spaCy pipeline and put it _before_ the REL component.
-
```ini
[components.llm.task]
@llm_tasks = "spacy.REL.v1"
@@ -514,6 +613,10 @@ labels = ["LivesIn", "Visits"]
path = "rel_examples.jsonl"
```
+Note: the REL task relies on pre-extracted entities to make its prediction.
+Hence, you'll need to add a component that populates `doc.ents` with recognized
+spans to your spaCy pipeline and put it _before_ the REL component.
+
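+For example, a sketch of a pipeline that pairs a pretrained NER component with
+the REL task (pipeline, component name and example text are illustrative, and an
+OpenAI key is assumed for the default model):
+
+```python
+import spacy
+
+# The pretrained pipeline provides an "ner" component that populates doc.ents.
+nlp = spacy.load("en_core_web_sm")
+# Appended after "ner", so entities are available to the REL task.
+nlp.add_pipe(
+    "llm",
+    name="llm_rel",
+    config={"task": {"@llm_tasks": "spacy.REL.v1", "labels": ["LivesIn", "Visits"]}},
+)
+doc = nlp("Laura visits Paris every summer.")
+# Predicted relations are stored in a custom extension attribute.
+print(doc._.rel)
+```
+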
#### spacy.Lemma.v1 {id="lemma-v1"}
The `Lemma.v1` task lemmatizes the provided text and updates the `lemma_`
@@ -527,10 +630,10 @@ attribute in the doc's tokens accordingly.
> examples = null
> ```
-| Argument | Description |
-| ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `template` | Custom prompt template to send to LLM backend. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [lemma.jinja](https://github.com/spacy-llm/spacy_llm/tasks/templates/lemma.jinja). ~~str~~ |
-| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| Argument | Description |
+| ---------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [lemma.jinja](https://github.com/spacy-llm/spacy_llm/tasks/templates/lemma.jinja). ~~str~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
`Lemma.v1` prompts the LLM to lemmatize the passed text and return the
lemmatized version as a list of tokens and their corresponding lemma. E. g. the
@@ -585,6 +688,49 @@ reader `spacy.FewShotReader.v1` supports `.yml`, `.yaml`, `.json` and `.jsonl`.
path = "lemma_examples.yml"
```
+#### spacy.Sentiment.v1 {id="sentiment-v1"}
+
+Performs sentiment analysis on provided texts. Scores between 0 and 1 are stored
+in `Doc._.sentiment` - the higher, the more positive. Note that in case of
+parsing issues (e. g. unexpected LLM responses) the value might be `None`.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.task]
+> @llm_tasks = "spacy.Sentiment.v1"
+> examples = null
+> ```
+
+| Argument | Description |
+| ---------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `template` | Custom prompt template to send to LLM model. Default templates for each task are located in the `spacy_llm/tasks/templates` directory. Defaults to [sentiment.jinja](./spacy_llm/tasks/templates/sentiment.jinja). ~~str~~ |
+| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
+| `field`    | Name of extension attribute to store the sentiment score in (i. e. the score will be available in `doc._.{field}`). Defaults to `sentiment`. ~~str~~ |
+
+To perform few-shot learning, you can write down a few examples in a separate
+file, and provide these to be injected into the prompt to the LLM. The default
+reader `spacy.FewShotReader.v1` supports `.yml`, `.yaml`, `.json` and `.jsonl`.
+
+```yaml
+- text: 'This is horrifying.'
+ score: 0
+- text: 'This is underwhelming.'
+ score: 0.25
+- text: 'This is ok.'
+ score: 0.5
+- text: "I'm looking forward to this!"
+ score: 1.0
+```
+
+```ini
+[components.llm.task]
+@llm_tasks = "spacy.Sentiment.v1"
+[components.llm.task.examples]
+@misc = "spacy.FewShotReader.v1"
+path = "sentiment_examples.yml"
+```
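+
+A minimal usage sketch, assuming the config above is saved as `config.cfg` (a
+hypothetical file name) and the required API key is set:
+
+```python
+from spacy_llm.util import assemble
+
+nlp = assemble("config.cfg")
+doc = nlp("This is underwhelming.")
+# Score between 0 and 1; may be None if the LLM response could not be parsed.
+print(doc._.sentiment)
+```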
+
#### spacy.NoOp.v1 {id="noop-v1"}
> #### Example config
@@ -597,9 +743,9 @@ path = "lemma_examples.yml"
This task is only useful for testing - it tells the LLM to do nothing, and does
not set any fields on the `docs`.
-### Backends {id="backends"}
+### Models {id="models"}
-A _backend_ defines which LLM model to query, and how to query it. It can be a
+A _model_ defines which LLM model to query, and how to query it. It can be a
simple function taking a collection of prompts (consistent with the output type
of `task.generate_prompts()`) and returning a collection of responses
(consistent with the expected input of `parse_responses`). Generally speaking,
@@ -607,156 +753,496 @@ it's a function of type `Callable[[Iterable[Any]], Iterable[Any]]`, but specific
implementations can have other signatures, like
`Callable[[Iterable[str]], Iterable[str]]`.
-All built-in backends are registered in `llm_backends`. If no backend is
-specified, the repo currently connects to the [`OpenAI` API](#openai) by
-default, using the built-in REST protocol, and accesses the `"gpt-3.5-turbo"`
-model.
+All built-in models are registered in `llm_models`. If no model is specified,
+the repo currently connects to the `OpenAI` API by default using REST, and
+accesses the `"gpt-3.5-turbo"` model.
+
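+Conceptually, a model in this sense is just a callable mapping prompts to
+responses. A minimal sketch of such a callable - shown here without registration
+in the `llm_models` registry, purely to illustrate the expected signature:
+
+```python
+from typing import Iterable
+
+
+def dummy_model(prompts: Iterable[str]) -> Iterable[str]:
+    """No-op 'model' with the signature expected by the llm component."""
+    for prompt in prompts:
+        # A real implementation would call an LLM API or run a local model here.
+        yield f"(no response generated for a prompt of {len(prompt)} characters)"
+```
+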
+Currently, three different approaches to using LLMs are supported:
+
+1. `spacy-llm`'s native REST backend. This is the default for all hosted models
+   (e. g. OpenAI, Cohere, Anthropic, ...).
+2. A HuggingFace integration that allows you to run a limited set of HF models
+   locally.
+3. A LangChain integration that allows you to run any model supported by
+   LangChain (hosted or local).
+
+Approaches 1 and 2 are the defaults for hosted and local models, respectively.
+Alternatively, you can use LangChain to access hosted or local models by
+specifying one of the models registered with the `langchain.` prefix.
-_Why are there backends for third-party libraries in addition to a
-native REST backend and which should I choose?_
+_Why LangChain if there are also a native REST and a HuggingFace backend? When should I use what?_
-Third-party libraries like `langchain` or `minichain` focus on prompt
-management, integration of many different LLM APIs, and other related features
-such as conversational memory or agents. `spacy-llm` on the other hand
-emphasizes features we consider useful in the context of NLP pipelines utilizing
-LLMs to process documents (mostly) independent from each other. It makes sense
-that the feature set of such third-party libraries and `spacy-llm` is not
-identical - and users might want to take advantage of features not available in
-`spacy-llm`.
+Third-party libraries like `langchain` focus on prompt management, integration
+of many different LLM APIs, and other related features such as conversational
+memory or agents. `spacy-llm` on the other hand emphasizes features we consider
+useful in the context of NLP pipelines utilizing LLMs to process documents
+(mostly) independent from each other. It makes sense that the feature sets of
+such third-party libraries and `spacy-llm` aren't identical - and users might
+want to take advantage of features not available in `spacy-llm`.
-The advantage of offering our own REST backend is that we can ensure a larger
-degree of stability of robustness, as we can guarantee backwards-compatibility
-and more smoothly integrated error handling.
+The advantage of implementing our own REST and HuggingFace integrations is that
+we can ensure a larger degree of stability and robustness, as we can guarantee
+backwards-compatibility and more smoothly integrated error handling.
-Ultimately we recommend trying to implement your use case using the REST backend
-first (which is configured as the default backend). If however there are
-features or APIs not covered by `spacy-llm`, it's trivial to switch to the
-backend of a third-party library - and easy to customize the prompting
+If, however, there are features or APIs not natively covered by `spacy-llm`, it's
+trivial to utilize LangChain to cover this - and easy to customize the prompting
mechanism, if so required.
-#### OpenAI {id="openai"}
+Note that when using hosted services, you have to ensure that the proper API
+keys are set as environment variables as described by the corresponding
+provider's documentation.
-When the backend uses OpenAI, you have to get an API key from openai.com, and
-ensure that the keys are set as environmental variables:
+E. g. when using OpenAI, you have to get an API key from openai.com, and ensure
+that the keys are set as environment variables:
```shell
export OPENAI_API_KEY="sk-..."
export OPENAI_API_ORG="org-..."
```
-#### spacy.REST.v1 {id="rest-v1"}
+For Cohere it's
-This default backend uses `requests` and a simple retry mechanism to access an
-API.
+```shell
+export CO_API_KEY="..."
+```
+
+and for Anthropic
+
+```shell
+export ANTHROPIC_API_KEY="..."
+```
+
+#### spacy.GPT-4.v1 {id="gpt-4"}
+
+OpenAI's `gpt-4` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.GPT-4.v1"
+> name = "gpt-4"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"gpt-4"`. ~~Literal["gpt-4", "gpt-4-0314", "gpt-4-32k", "gpt-4-32k-0314"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.GPT-3-5.v1 {id="gpt-3-5"}
+
+OpenAI's `gpt-3-5` model family.
> #### Example config
>
> ```ini
-> [components.llm.backend]
-> @llm_backends = "spacy.REST.v1"
-> api = "OpenAI"
-> config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
+> [components.llm.model]
+> @llm_models = "spacy.GPT-3-5.v1"
+> name = "gpt-3.5-turbo"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"gpt-3.5-turbo"`. ~~Literal["gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-0613-16k"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Text-Davinci.v1 {id="text-davinci"}
+
+OpenAI's `text-davinci` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Text-Davinci.v1"
+> name = "text-davinci-003"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"text-davinci-003"`. ~~Literal["text-davinci-002", "text-davinci-003"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Code-Davinci.v1 {id="code-davinci"}
+
+OpenAI's `code-davinci` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Code-Davinci.v1"
+> name = "code-davinci-002"
+> config = {"temperature": 0.3}
> ```
| Argument | Description |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `api` | The name of a supported API. In v.0.1.0, only "OpenAI" is supported. ~~str~~ |
-| `config` | Further configuration passed on to the backend. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"code-davinci-002"`. ~~Literal["code-davinci-002"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
-When `api` is set to `OpenAI`, the following settings can be defined in the
-`config` dictionary:
+#### spacy.Text-Curie.v1 {id="text-curie"}
-- `model`: one of the following list of supported models:
- - `"gpt-4"`
- - `"gpt-4-0314"`
- - `"gpt-4-32k"`
- - `"gpt-4-32k-0314"`
- - `"gpt-3.5-turbo"`
- - `"gpt-3.5-turbo-0301"`
- - `"text-davinci-003"`
- - `"text-davinci-002"`
- - `"text-curie-001"`
- - `"text-babbage-001"`
- - `"text-ada-001"`
- - `"davinci"`
- - `"curie"`
- - `"babbage"`
- - `"ada"`
-- `url`: By default, this is `https://api.openai.com/v1/completions`. For models
- requiring the chat endpoint, use `https://api.openai.com/v1/chat/completions`.
-
-#### spacy.MiniChain.v1 {id="minichain-v1"}
-
-To use [MiniChain](https://github.com/srush/MiniChain) for the API retrieval
-part, make sure you have installed it first:
-
-```shell
-python -m pip install "minichain>=0.3,<0.4"
-# Or install with spacy-llm directly
-python -m pip install "spacy-llm[minichain]"
-```
-
-Note that MiniChain currently only supports Python 3.8, 3.9 and 3.10.
+OpenAI's `text-curie` model family.
> #### Example config
>
> ```ini
-> [components.llm.backend]
-> @llm_backends = "spacy.MiniChain.v1"
-> api = "OpenAI"
->
-> [components.llm.backend.query]
-> @llm_queries = "spacy.RunMiniChain.v1"
-> ```
-
-| Argument | Description |
-| -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `api` | The name of an API supported by MiniChain, e.g. "OpenAI". ~~str~~ |
-| `config` | Further configuration passed on to the backend. Defaults to `{}`. ~~Dict[Any, Any]~~ |
-| `query` | Function that executes the prompts. If `None`, defaults to `spacy.RunMiniChain.v1`. Defaults to `None`. ~~Optional[Callable[["minichain.backend.Backend", Iterable[str]], Iterable[str]]]~~ |
-
-The default `query` (`spacy.RunMiniChain.v1`) executes the prompts by running
-`model(text).run()` for each given textual prompt.
-
-#### spacy.LangChain.v1 {id="langchain-v1"}
-
-To use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval
-part, make sure you have installed it first:
-
-```shell
-python -m pip install "langchain>=0.0.144,<0.1"
-# Or install with spacy-llm directly
-python -m pip install "spacy-llm[langchain]"
-```
-
-Note that LangChain currently only supports Python 3.9 and beyond.
-
-> #### Example config
->
-> ```ini
-> [components.llm.backend]
-> @llm_backends = "spacy.LangChain.v1"
-> api = "OpenAI"
-> query = {"@llm_queries": "spacy.CallLangChain.v1"}
+> [components.llm.model]
+> @llm_models = "spacy.Text-Curie.v1"
+> name = "text-curie-001"
> config = {"temperature": 0.3}
> ```
-| Argument | Description |
-| -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `api` | The name of an API supported by LangChain, e.g. "OpenAI". ~~str~~ |
-| `config` | Further configuration passed on to the backend. Defaults to `{}`. ~~Dict[Any, Any]~~ |
-| `query` | Function that executes the prompts. If `None`, defaults to `spacy.CallLangChain.v1`. Defaults to `None`. ~~Optional[Callable[["langchain.llms.BaseLLM", Iterable[Any]], Iterable[Any]]]~~ |
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"text-curie-001"`. ~~Literal["text-curie-001"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
-The default `query` (`spacy.CallLangChain.v1`) executes the prompts by running
-`model(text)` for each given textual prompt.
+#### spacy.Text-Babbage.v1 {id="text-babbage"}
-#### spacy.Dolly_HF.v1 {id="dollyhf-v1"}
+OpenAI's `text-babbage` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Text-Babbage.v1"
+> name = "text-babbage-001"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"text-babbage-001"`. ~~Literal["text-babbage-001"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Text-Ada.v1 {id="text-ada"}
+
+OpenAI's `text-ada` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Text-Ada.v1"
+> name = "text-ada-001"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"text-ada-001"`. ~~Literal["text-ada-001"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Davinci.v1 {id="davinci"}
+
+OpenAI's `davinci` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Davinci.v1"
+> name = "davinci"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"davinci"`. ~~Literal["davinci"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Curie.v1 {id="curie"}
+
+OpenAI's `curie` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Curie.v1"
+> name = "curie"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"curie"`. ~~Literal["curie"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Babbage.v1 {id="babbage"}
+
+OpenAI's `babbage` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Babbage.v1"
+> name = "babbage"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"babbage"`. ~~Literal["babbage"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Ada.v1 {id="ada"}
+
+OpenAI's `ada` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Ada.v1"
+> name = "ada"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"ada"`. ~~Literal["ada"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Command.v1 {id="command"}
+
+Cohere's `command` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Command.v1"
+> name = "command"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"command"`. ~~Literal["command", "command-light", "command-light-nightly", "command-nightly"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Claude-1.v1 {id="claude-1"}
+
+Anthropic's `claude-1` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Claude-1.v1"
+> name = "claude-1"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"claude-1"`. ~~Literal["claude-1", "claude-1-100k"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Claude-instant-1.v1 {id="claude-instant-1"}
+
+Anthropic's `claude-instant-1` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Claude-instant-1.v1"
+> name = "claude-instant-1"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"claude-instant-1"`. ~~Literal["claude-instant-1", "claude-instant-1-100k"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Claude-instant-1-1.v1 {id="claude-instant-1-1"}
+
+Anthropic's `claude-instant-1.1` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Claude-instant-1-1.v1"
+> name = "claude-instant-1.1"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"claude-instant-1.1"`. ~~Literal["claude-instant-1.1", "claude-instant-1.1-100k"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Claude-1-0.v1 {id="claude-1-0"}
+
+Anthropic's `claude-1.0` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Claude-1-0.v1"
+> name = "claude-1.0"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"claude-1.0"`. ~~Literal["claude-1.0"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Claude-1-2.v1 {id="claude-1-2"}
+
+Anthropic's `claude-1.2` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Claude-1-2.v1"
+> name = "claude-1.2"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"claude-1.2"`. ~~Literal["claude-1.2"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Claude-1-3.v1 {id="claude-1-3"}
+
+Anthropic's `claude-1.3` model family.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Claude-1-3.v1"
+> name = "claude-1.3"
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | Model name, i. e. any supported variant for this particular model. Defaults to `"claude-1.3"`. ~~Literal["claude-1.3", "claude-1.3-100k"]~~ |
+| `config` | Further configuration passed on to the model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `strict` | If `True`, raises an error if the LLM API returns a malformed response. Otherwise, return the error responses as is. Defaults to `True`. ~~bool~~ |
+| `max_tries` | Max. number of tries for API request. Defaults to `3`. ~~int~~ |
+| `timeout` | Timeout for API request in seconds. Defaults to `30`. ~~int~~ |
+
+#### spacy.Dolly.v1 {id="dolly"}
+
+To use this model, ideally you have a GPU enabled and have installed
+`transformers`, `torch` and CUDA in your virtual environment. This allows you to
+have the setting `device=cuda:0` in your config, which ensures that the model is
+loaded entirely on the GPU (and fails otherwise).
+
+You can do so with
+
+```shell
+python -m pip install "spacy-llm[transformers]" "transformers[sentencepiece]"
+```
+
+If you don't have access to a GPU, you can install `accelerate` and set
+`device_map=auto` instead, but be aware that this may result in some layers
+being offloaded to the CPU or even the hard drive, which can ultimately make
+queries extremely slow.
+
+```shell
+python -m pip install "accelerate>=0.16.0,<1.0"
+```
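+
+For illustration, here is a minimal sketch of the corresponding `model` block
+as a Python config dict (e. g. for `nlp.add_pipe("llm", config={...})`),
+assuming the device settings are forwarded to `transformers.pipeline()` via
+`config_init` as described in the table below:
+
+```python
+# Model block of the llm component config, written as a Python dict.
+dolly_model = {
+    "@llm_models": "spacy.Dolly.v1",
+    "name": "dolly-v2-3b",
+    # With a GPU: load the model entirely on the first CUDA device.
+    "config_init": {"device": "cuda:0"},
+    # Without a dedicated GPU and with `accelerate` installed, you could use
+    # {"device_map": "auto"} instead - layers may be offloaded to CPU or disk.
+}
+```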
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "spacy.Dolly.v1"
+> name = "dolly-v2-3b"
+> ```
+
+| Argument | Description |
+| ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | The name of a Dolly model that is supported (e. g. "dolly-v2-3b" or "dolly-v2-12b"). ~~Literal["dolly-v2-3b", "dolly-v2-7b", "dolly-v2-12b"]~~ |
+| `config_init` | Further configuration passed on to the construction of the model with `transformers.pipeline()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
+| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
+
+Supported models (see the
+[Databricks models page](https://huggingface.co/databricks) on Hugging Face for
+details):
+
+- `"databricks/dolly-v2-3b"`
+- `"databricks/dolly-v2-7b"`
+- `"databricks/dolly-v2-12b"`
+
+Note that Hugging Face will download this model the first time you use it - you
+can
+[define the cache directory](https://huggingface.co/docs/huggingface_hub/main/en/guides/manage-cache)
+by setting the environment variable `HF_HOME`.
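+
+As a small sketch in Python (the path is just a placeholder, and the variable
+should ideally be set before any Hugging Face libraries are imported):
+
+```python
+import os
+
+# Store downloaded Hugging Face models under a custom directory.
+os.environ["HF_HOME"] = "/path/to/hf-cache"
+```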
+
+#### spacy.Falcon.v1 {id="falcon"}
-To use this backend, ideally you have a GPU enabled and have installed
+To use this model, ideally you have a GPU enabled and have installed
`transformers`, `torch` and CUDA in your virtual environment. This allows you to
@@ -781,33 +1267,25 @@ python -m pip install "accelerate>=0.16.0,<1.0"
> #### Example config
>
> ```ini
-> [components.llm.backend]
-> @llm_backends = "spacy.Dolly_HF.v1"
-> model = "databricks/dolly-v2-3b"
+> [components.llm.model]
+> @llm_models = "spacy.Falcon.v1"
+> name = "falcon-7b"
> ```
-| Argument | Description |
-| ------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
-| `model` | The name of a Dolly model that is supported. ~~str~~ |
-| `config_init` | Further configuration passed on to the construction of the model with `transformers.pipeline()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
-| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
-
-Supported models (see the
-[Databricks models page](https://huggingface.co/databricks) on Hugging Face for
-details):
-
-- `"databricks/dolly-v2-3b"`
-- `"databricks/dolly-v2-7b"`
-- `"databricks/dolly-v2-12b"`
+| Argument | Description |
+| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `name`        | The name of a Falcon model variant that is supported. Defaults to `"falcon-7b-instruct"`. ~~Literal["falcon-rw-1b", "falcon-7b", "falcon-7b-instruct", "falcon-40b-instruct"]~~ |
+| `config_init` | Further configuration passed on to the construction of the model with `transformers.pipeline()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
+| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
Note that Hugging Face will download this model the first time you use it - you
can
-[define the cached directory](https://huggingface.co/docs/huggingface_hub/main/en/guides/manage-cache)
+[define the cache directory](https://huggingface.co/docs/huggingface_hub/main/en/guides/manage-cache)
by setting the environmental variable `HF_HOME`.
-#### spacy.StableLM_HF.v1 {id="stablelmhf-v1"}
+#### spacy.StableLM.v1 {id="stablelm"}
-To use this backend, ideally you have a GPU enabled and have installed
+To use this model, ideally you have a GPU enabled and have installed
`transformers`, `torch` and CUDA in your virtual environment.
You can do so with
@@ -828,34 +1306,29 @@ python -m pip install "accelerate>=0.16.0,<1.0"
> #### Example config
>
> ```ini
-> [components.llm.backend]
-> @llm_backends = "spacy.StableLM_HF.v1"
-> model = "stabilityai/stablelm-tuned-alpha-7b"
+> [components.llm.model]
+> @llm_models = "spacy.StableLM.v1"
+> name = "stablelm-tuned-alpha-7b"
> ```
-| Argument | Description |
-| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `model` | The name of a StableLM model that is supported. ~~str~~ |
-| `config_init` | Further configuration passed on to the construction of the model with `transformers.AutoModelForCausalLM.from_pretrained()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
-| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
+| Argument | Description |
+| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name` | The name of a StableLM model that is supported (e. g. "stablelm-tuned-alpha-7b"). ~~Literal["stablelm-base-alpha-3b", "stablelm-base-alpha-7b", "stablelm-tuned-alpha-3b", "stablelm-tuned-alpha-7b"]~~ |
+| `config_init` | Further configuration passed on to the construction of the model with `transformers.AutoModelForCausalLM.from_pretrained()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
+| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
-Supported models (see the
+See the
[Stability AI StableLM GitHub repo](https://github.com/Stability-AI/StableLM/#stablelm-alpha)
-for details):
-
-- `"stabilityai/stablelm-base-alpha-3b"`
-- `"stabilityai/stablelm-base-alpha-7b"`
-- `"stabilityai/stablelm-tuned-alpha-3b"`
-- `"stabilityai/stablelm-tuned-alpha-7b"`
+for details.
Note that Hugging Face will download this model the first time you use it - you
can
[define the cached directory](https://huggingface.co/docs/huggingface_hub/main/en/guides/manage-cache)
by setting the environmental variable `HF_HOME`.
-#### spacy.OpenLLaMaHF.v1 {id="openllamahf-v1"}
+#### spacy.OpenLLaMA.v1 {id="openllama"}
-To use this backend, ideally you have a GPU enabled and have installed
+To use this model, ideally you have a GPU enabled and have installed
- `transformers[sentencepiece]`
- `torch`
@@ -879,31 +1352,67 @@ python -m pip install "accelerate>=0.16.0,<1.0"
> #### Example config
>
> ```ini
-> [components.llm.backend]
-> @llm_backends = "spacy.OpenLLaMaHF.v1"
-> model = "openlm-research/open_llama_3b_350bt_preview"
+> [components.llm.model]
+> @llm_models = "spacy.OpenLLaMA.v1"
+> name = "open_llama_3b_350bt_preview"
> ```
-| Argument | Description |
-| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `model` | The name of a OpenLLaMa model that is supported. ~~str~~ |
-| `config_init` | Further configuration passed on to the construction of the model with `transformers.AutoModelForCausalLM.from_pretrained()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
-| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
+| Argument | Description |
+| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `name`        | The name of an OpenLLaMA model that is supported (e. g. "open_llama_3b_350bt_preview"). ~~Literal["open_llama_3b_350bt_preview", "open_llama_3b_600bt_preview", "open_llama_7b_400bt_preview", "open_llama_7b_700bt_preview"]~~ |
+| `config_init` | Further configuration passed on to the construction of the model with `transformers.AutoModelForCausalLM.from_pretrained()`. Defaults to `{}`. ~~Dict[str, Any]~~ |
+| `config_run` | Further configuration used during model inference. Defaults to `{}`. ~~Dict[str, Any]~~ |
-Supported models (see the
-[OpenLM Research OpenLLaMa GitHub repo](https://github.com/openlm-research/open_llama)
-for details):
-
-- `"openlm-research/open_llama_3b_350bt_preview"`
-- `"openlm-research/open_llama_3b_600bt_preview"`
-- `"openlm-research/open_llama_7b_400bt_preview"`
-- `"openlm-research/open_llama_7b_700bt_preview"`
+See the
+[OpenLM Research OpenLLaMA GitHub repo](https://github.com/openlm-research/open_llama)
+for details.
Note that Hugging Face will download this model the first time you use it - you
can
[define the cached directory](https://huggingface.co/docs/huggingface_hub/main/en/guides/manage-cache)
by setting the environmental variable `HF_HOME`.
+#### LangChain models {id="langchain-models"}
+
+To use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval
+part, make sure you have installed it first:
+
+```shell
+python -m pip install "langchain==0.0.191"
+# Or install with spacy-llm directly
+python -m pip install "spacy-llm[extras]"
+```
+
+Note that LangChain currently only supports Python 3.9 and later.
+
+LangChain models in `spacy-llm` work slightly differently: `langchain`'s models
+are registered automatically, with one entry in `spacy-llm`'s registry for each
+LLM class in `langchain`. As `langchain` has one class per API rather than per
+model, this results in registry entries like `langchain.OpenAI.v1` - i. e. there
+is one registry entry per API instead of one per model (family), as is the case
+for the REST- and HuggingFace-based entries.
+
+The name of the model to be used has to be passed in via the `name` attribute.
+
+> #### Example config
+>
+> ```ini
+> [components.llm.model]
+> @llm_models = "langchain.OpenAI.v1"
+> name = "gpt-3.5-turbo"
+> query = {"@llm_queries": "spacy.CallLangChain.v1"}
+> config = {"temperature": 0.3}
+> ```
+
+| Argument | Description |
+| -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `name`   | The name of a model supported by LangChain for this API. ~~str~~                                                                                       |
+| `config` | Configuration passed on to the LangChain model. Defaults to `{}`. ~~Dict[Any, Any]~~ |
+| `query` | Function that executes the prompts. If `None`, defaults to `spacy.CallLangChain.v1`. ~~Optional[Callable[["langchain.llms.BaseLLM", Iterable[Any]], Iterable[Any]]]~~ |
+
+The default `query` (`spacy.CallLangChain.v1`) executes the prompts by running
+`model(text)` for each given textual prompt.
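+
+As a sketch, a custom `query` function could look as follows. This assumes
+that, as with other registered functions, the entry in `llm_queries` is a
+factory returning the actual callable, and that the registry is exposed via
+`spacy_llm.registry`; the registry name `"my.CallLangChainVerbose.v1"` is
+purely hypothetical:
+
+```python
+from typing import Any, Callable, Iterable
+
+from spacy_llm.registry import registry
+
+
+@registry.llm_queries("my.CallLangChainVerbose.v1")
+def verbose_query() -> Callable[[Any, Iterable[Any]], Iterable[Any]]:
+    """Return a query function that logs each prompt before calling the LLM."""
+
+    def prompt(model: Any, prompts: Iterable[Any]) -> Iterable[Any]:
+        # `model` is the instantiated langchain LLM; calling it with a textual
+        # prompt returns the completion, as spacy.CallLangChain.v1 does.
+        responses = []
+        for p in prompts:
+            print(f"Sending prompt of length {len(str(p))}")
+            responses.append(model(p))
+        return responses
+
+    return prompt
+```
+
+To use it, reference it in the config via
+`query = {"@llm_queries": "my.CallLangChainVerbose.v1"}` and make sure the
+module defining the function is imported before the pipeline is assembled.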
+
### Cache {id="cache"}
Interacting with LLMs, either through an external API or a local instance, is
diff --git a/website/docs/usage/large-language-models.mdx b/website/docs/usage/large-language-models.mdx
index ff99fff20..bdf37fa06 100644
--- a/website/docs/usage/large-language-models.mdx
+++ b/website/docs/usage/large-language-models.mdx
@@ -21,10 +21,9 @@ required.
- Serializable `llm` **component** to integrate prompts into your pipeline
- **Modular functions** to define the [**task**](#tasks) (prompting and parsing)
- and [**backend**](#backends) (model to use)
+ and [**model**](#models) (model to use)
- Support for **hosted APIs** and self-hosted **open-source models**
-- Integration with [`MiniChain`](https://github.com/srush/MiniChain) and
- [`LangChain`](https://github.com/hwchase17/langchain)
+- Integration with [`LangChain`](https://github.com/hwchase17/langchain)
- Access to
**[OpenAI API](https://platform.openai.com/docs/api-reference/introduction)**,
including GPT-4 and various GPT-3 models
@@ -85,9 +84,9 @@ python -m pip install spacy-llm
## Usage {id="usage"}
-The task and the backend have to be supplied to the `llm` pipeline component
-using [spaCy's config system](https://spacy.io/api/data-formats#config). This
-package provides various built-in functionality, as detailed in the [API](#-api)
+The task and the model have to be supplied to the `llm` pipeline component using
+[spaCy's config system](https://spacy.io/api/data-formats#config). This package
+provides various built-in functionality, as detailed in the [API](#-api)
documentation.
### Example 1: Add a text classifier using a GPT-3 model from OpenAI {id="example-1"}
@@ -114,10 +113,9 @@ factory = "llm"
@llm_tasks = "spacy.TextCat.v2"
labels = ["COMPLIMENT", "INSULT"]
-[components.llm.backend]
-@llm_backends = "spacy.REST.v1"
-api = "OpenAI"
-config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
+[components.llm.model]
+@llm_models = "spacy.GPT-3-5.v1"
+config = {"temperature": 0.3}
```
Now run:
@@ -153,10 +151,10 @@ factory = "llm"
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORGANISATION", "LOCATION"]
-[components.llm.backend]
-@llm_backends = "spacy.Dolly_HF.v1"
-# For better performance, use databricks/dolly-v2-12b instead
-model = "databricks/dolly-v2-3b"
+[components.llm.model]
+@llm_models = "spacy.Dolly.v1"
+# For better performance, use dolly-v2-12b instead
+name = "dolly-v2-3b"
```
Now run:
@@ -191,10 +189,8 @@ nlp.add_pipe(
"@llm_tasks": "spacy.NER.v2",
"labels": ["PERSON", "ORGANISATION", "LOCATION"]
},
- "backend": {
- "@llm_backends": "spacy.REST.v1",
- "api": "OpenAI",
- "config": {"model": "gpt-3.5-turbo"},
+ "model": {
+ "@llm_models": "spacy.gpt-3.5.v1",
},
},
)
@@ -312,7 +308,7 @@ Text:
You look gorgeous!
'''
-Backend response for doc: You look gorgeous!
+Model response for doc: You look gorgeous!
COMPLIMENT
```
@@ -324,8 +320,8 @@ COMPLIMENT
## API {id="api"}
-`spacy-llm` exposes a `llm` factory with [configurable settings](api/large-language-models#config).
-
+`spacy-llm` exposes a `llm` factory with
+[configurable settings](api/large-language-models#config).
An `llm` component is defined by two main settings:
@@ -372,7 +368,8 @@ method is defined, `spacy-llm` will call it to evaluate the component.
| --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [`task.generate_prompts`](/api/large-language-models#task-generate-prompts) | Takes a collection of documents, and returns a collection of "prompts", which can be of type `Any`. |
| [`task.parse_responses`](/api/large-language-models#task-parse-responses) | Takes a collection of LLM responses and the original documents, parses the responses into structured information, and sets the annotations on the documents. |
-| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. |
+| [`spacy.Summarization.v1`](/api/large-language-models#summarization-v1) | The summarization task prompts the model for a concise summary of the provided text. |
+| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. This version also supports explicitly defining the provided labels with custom descriptions. |
| [`spacy.NER.v1`](/api/large-language-models#ner-v1) | The original version of the built-in NER task supports both zero-shot and few-shot prompting. |
| [`spacy.SpanCat.v2`](/api/large-language-models#spancat-v2) | The built-in SpanCat task is a simple adaptation of the NER task to support overlapping entities and store its annotations in `doc.spans`. |
| [`spacy.SpanCat.v1`](/api/large-language-models#spancat-v1) | The original version of the built-in SpanCat task is a simple adaptation of the v1 NER task to support overlapping entities and store its annotations in `doc.spans`. |
@@ -381,11 +378,12 @@ method is defined, `spacy-llm` will call it to evaluate the component.
| [`spacy.TextCat.v1`](/api/large-language-models#textcat-v1) | Version 1 of the built-in TextCat task supports both zero-shot and few-shot prompting. |
| [`spacy.REL.v1`](/api/large-language-models#rel-v1) | The built-in REL task supports both zero-shot and few-shot prompting. It relies on an upstream NER component for entities extraction. |
| [`spacy.Lemma.v1`](/api/large-language-models#lemma-v1) | The `Lemma.v1` task lemmatizes the provided text and updates the `lemma_` attribute in the doc's tokens accordingly. |
+| [`spacy.Sentiment.v1`](/api/large-language-models#sentiment-v1) | Performs sentiment analysis on provided texts. |
| [`spacy.NoOp.v1`](/api/large-language-models#noop-v1) | This task is only useful for testing - it tells the LLM to do nothing, and does not set any fields on the `docs`. |
-### Backends {id="backends"}
+### Model {id="models"}
-A _backend_ defines which LLM model to query, and how to query it. It can be a
+A _model_ defines which LLM model to query, and how to query it. It can be a
simple function taking a collection of prompts (consistent with the output type
of `task.generate_prompts()`) and returning a collection of responses
(consistent with the expected input of `parse_responses`). Generally speaking,
@@ -393,52 +391,101 @@ it's a function of type `Callable[[Iterable[Any]], Iterable[Any]]`, but specific
implementations can have other signatures, like
`Callable[[Iterable[str]], Iterable[str]]`.
-All built-in backends are registered in `llm_backends`. If no backend is
-specified, the repo currently connects to the [`OpenAI` API](#openai) by
-default, using the built-in REST protocol, and accesses the `"gpt-3.5-turbo"`
-model.
+All built-in models are registered in `llm_models`. If no model is specified,
+the repo currently connects to the `OpenAI` API by default using REST, and
+accesses the `"gpt-3.5-turbo"` model.
+
+Currently, three different approaches to using LLMs are supported:
+
+1. `spacy-llm`'s native REST backend. This is the default for all hosted models
+   (e. g. OpenAI, Cohere, Anthropic, ...).
+2. A HuggingFace integration that allows you to run a limited set of HF models
+   locally.
+3. A LangChain integration that allows you to run any model supported by
+   LangChain (hosted or local).
+
+Approaches 1 and 2 are the default for hosted models and local models,
+respectively. Alternatively, you can use LangChain to access hosted or local
+models by specifying one of the models registered with the `langchain.` prefix.
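+
+For illustration, the three approaches differ only in the `model` block of the
+`llm` component's config. Shown here as Python dicts (registry names taken from
+the examples and tables in these docs):
+
+```python
+# 1. Hosted model via spacy-llm's native REST support (needs OPENAI_API_KEY):
+rest_model = {"@llm_models": "spacy.GPT-3-5.v1", "config": {"temperature": 0.3}}
+
+# 2. Open-source model run locally through the Hugging Face integration:
+hf_model = {"@llm_models": "spacy.Dolly.v1", "name": "dolly-v2-3b"}
+
+# 3. Any model supported by LangChain, via the `langchain.` registry entries:
+langchain_model = {"@llm_models": "langchain.OpenAI.v1", "name": "gpt-3.5-turbo"}
+```
+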
-_Why are there backends for third-party libraries in addition to a
-native REST backend and which should I choose?_
+_Why LangChain if there are also a native REST and a HuggingFace backend? When should I use what?_
-Third-party libraries like `langchain` or `minichain` focus on prompt
-management, integration of many different LLM APIs, and other related features
-such as conversational memory or agents. `spacy-llm` on the other hand
-emphasizes features we consider useful in the context of NLP pipelines utilizing
-LLMs to process documents (mostly) independent from each other. It makes sense
-that the feature set of such third-party libraries and `spacy-llm` is not
-identical - and users might want to take advantage of features not available in
-`spacy-llm`.
+Third-party libraries like `langchain` focus on prompt management, integration
+of many different LLM APIs, and other related features such as conversational
+memory or agents. `spacy-llm` on the other hand emphasizes features we consider
+useful in the context of NLP pipelines utilizing LLMs to process documents
+(mostly) independent from each other. It makes sense that the feature sets of
+such third-party libraries and `spacy-llm` aren't identical - and users might
+want to take advantage of features not available in `spacy-llm`.
-The advantage of offering our own REST backend is that we can ensure a larger
-degree of stability of robustness, as we can guarantee backwards-compatibility
-and more smoothly integrated error handling.
+The advantage of implementing our own REST and HuggingFace integrations is that
+we can ensure a larger degree of stability and robustness, as we can guarantee
+backwards-compatibility and more smoothly integrated error handling.
-Ultimately we recommend trying to implement your use case using the REST backend
-first (which is configured as the default backend). If however there are
-features or APIs not covered by `spacy-llm`, it's trivial to switch to the
-backend of a third-party library - and easy to customize the prompting
+If however there are features or APIs not natively covered by `spacy-llm`, it's
+trivial to utilize LangChain to cover this - and easy to customize the prompting
mechanism, if so required.
-| Component | Description |
-| ------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
-| [`OpenAI`](/api/large-language-models#openai) | ?? |
-| [`spacy.REST.v1`](/api/large-language-models#rest-v1) | This default backend uses `requests` and a simple retry mechanism to access an API. |
-| [`spacy.MiniChain.v1`](/api/large-language-models#minichain-v1) | Use [MiniChain](https://github.com/srush/MiniChain) for the API retrieval. |
-| [`spacy.LangChain.v1`](/api/large-language-models#langchain-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
-| [`spacy.Dolly_HF.v1`](/api/large-language-models#dollyhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
-| [`spacy.StableLM_HF.v1`](/api/large-language-models#stablelmhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
-| [`spacy.OpenLLaMaHF.v1`](/api/large-language-models#openllamahf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
+Note that when using hosted services, you have to ensure that the proper API
+keys are set as environment variables as described by the corresponding
+provider's documentation.
+
+E. g. when using OpenAI, you have to get an API key from openai.com and ensure
+that the keys are set as environment variables:
+
+```shell
+export OPENAI_API_KEY="sk-..."
+export OPENAI_API_ORG="org-..."
+```
+
+For Cohere it's
+
+```shell
+export CO_API_KEY="..."
+```
+
+and for Anthropic
+
+```shell
+export ANTHROPIC_API_KEY="..."
+```
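+
+As a small sketch, you can fail fast if a required key is missing before
+assembling the pipeline (the variable names here match the OpenAI exports
+above; adjust them to your provider):
+
+```python
+import os
+
+# Check that the required keys are present before building the llm pipeline.
+required = ["OPENAI_API_KEY", "OPENAI_API_ORG"]
+missing = [var for var in required if not os.getenv(var)]
+if missing:
+    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
+```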
+
+| Component | Description |
+| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------ |
+| [`spacy.GPT-4.v1`](/api/large-language-models#gpt-4) | OpenAI’s `gpt-4` model family. |
+| [`spacy.GPT-3-5.v1`](/api/large-language-models#gpt-3-5) | OpenAI’s `gpt-3-5` model family. |
+| [`spacy.Text-Davinci.v1`](/api/large-language-models#text-davinci) | OpenAI’s `text-davinci` model family. |
+| [`spacy.Code-Davinci.v1`](/api/large-language-models#code-davinci) | OpenAI’s `code-davinci` model family. |
+| [`spacy.Text-Curie.v1`](/api/large-language-models#text-curie) | OpenAI’s `text-curie` model family. |
+| [`spacy.Text-Babbage.v1`](/api/large-language-models#text-babbage) | OpenAI’s `text-babbage` model family. |
+| [`spacy.Text-Ada.v1`](/api/large-language-models#text-ada) | OpenAI’s `text-ada` model family. |
+| [`spacy.Davinci.v1`](/api/large-language-models#davinci) | OpenAI’s `davinci` model family. |
+| [`spacy.Curie.v1`](/api/large-language-models#curie) | OpenAI’s `curie` model family. |
+| [`spacy.Babbage.v1`](/api/large-language-models#babbage) | OpenAI’s `babbage` model family. |
+| [`spacy.Ada.v1`](/api/large-language-models#ada) | OpenAI’s `ada` model family. |
+| [`spacy.Command.v1`](/api/large-language-models#command) | Cohere’s `command` model family. |
+| [`spacy.Claude-1.v1`](/api/large-language-models#claude-1) | Anthropic’s `claude-1` model family. |
+| [`spacy.Claude-instant-1.v1`](/api/large-language-models#claude-instant-1) | Anthropic’s `claude-instant-1` model family. |
+| [`spacy.Claude-instant-1-1.v1`](/api/large-language-models#claude-instant-1-1) | Anthropic’s `claude-instant-1.1` model family. |
+| [`spacy.Claude-1-0.v1`](/api/large-language-models#claude-1-0) | Anthropic’s `claude-1.0` model family. |
+| [`spacy.Claude-1-2.v1`](/api/large-language-models#claude-1-2) | Anthropic’s `claude-1.2` model family. |
+| [`spacy.Claude-1-3.v1`](/api/large-language-models#claude-1-3) | Anthropic’s `claude-1.3` model family. |
+| [`spacy.Dolly.v1`](/api/large-language-models#dolly) | Dolly models through [Databricks](https://huggingface.co/databricks) on HuggingFace. |
+| [`spacy.Falcon.v1`](/api/large-language-models#falcon) | Falcon model through HuggingFace. |
+| [`spacy.StableLM.v1`](/api/large-language-models#stablelm) | StableLM model through HuggingFace. |
+| [`spacy.OpenLLaMA.v1`](/api/large-language-models#openllama) | OpenLLaMA model through HuggingFace. |
+| [LangChain models](/api/large-language-models#langchain-models) | LangChain models for API retrieval. |
### Cache {id="cache"}
Interacting with LLMs, either through an external API or a local instance, is
costly. Since developing an NLP pipeline generally means a lot of exploration
-and prototyping, `spacy-llm` implements a built-in [cache](/api/large-language-models#cache) to avoid reprocessing
-the same documents at each run that keeps batches of documents stored on disk.
+and prototyping, `spacy-llm` implements a built-in
+[cache](/api/large-language-models#cache) that keeps batches of documents
+stored on disk, avoiding the need to reprocess the same documents at each run.
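+
+As a sketch of how this could look in the component config (shown as a Python
+dict; the registry name and settings of the built-in cache are assumptions
+here - check the [API documentation](/api/large-language-models#cache) for the
+exact names):
+
+```python
+# Cache block of the llm component config, written as a Python dict.
+cache_config = {
+    "@llm_misc": "spacy.BatchCache.v1",  # assumed registry name of the cache
+    "path": "local-llm-cache",           # directory for cached batches (placeholder)
+    "batch_size": 64,                    # docs stored per batch file
+    "max_batches_in_mem": 4,             # batches kept in memory before writing
+}
+```
+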
### Various functions {id="various-functions"}