update for v0.4.0

Victoria Slocum 2023-07-10 10:25:21 +02:00
parent b552b4f60a
commit 682dec77cf
2 changed files with 846 additions and 290 deletions



@@ -21,10 +21,9 @@ required.
- Serializable `llm` **component** to integrate prompts into your pipeline
- **Modular functions** to define the [**task**](#tasks) (prompting and parsing)
and [**backend**](#backends) (model to use)
and [**model**](#models) (model to use)
- Support for **hosted APIs** and self-hosted **open-source models**
- Integration with [`MiniChain`](https://github.com/srush/MiniChain) and
[`LangChain`](https://github.com/hwchase17/langchain)
- Integration with [`LangChain`](https://github.com/hwchase17/langchain)
- Access to
**[OpenAI API](https://platform.openai.com/docs/api-reference/introduction)**,
including GPT-4 and various GPT-3 models
@@ -85,9 +84,9 @@ python -m pip install spacy-llm
## Usage {id="usage"}
The task and the backend have to be supplied to the `llm` pipeline component
using [spaCy's config system](https://spacy.io/api/data-formats#config). This
package provides various built-in functionality, as detailed in the [API](#-api)
The task and the model have to be supplied to the `llm` pipeline component using
[spaCy's config system](https://spacy.io/api/data-formats#config). This package
provides various built-in functionality, as detailed in the [API](#-api)
documentation.
### Example 1: Add a text classifier using a GPT-3 model from OpenAI {id="example-1"}
@@ -114,10 +113,9 @@ factory = "llm"
@llm_tasks = "spacy.TextCat.v2"
labels = ["COMPLIMENT", "INSULT"]
[components.llm.backend]
@llm_backends = "spacy.REST.v1"
api = "OpenAI"
config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
```
Now run:
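As an illustrative sketch - assuming the config above is saved as `config.cfg` and an
OpenAI key is set in the environment - running it with the `assemble` helper from
`spacy_llm.util` could look like:
```python
from spacy_llm.util import assemble

# Build the pipeline from the config and classify a text.
nlp = assemble("config.cfg")
doc = nlp("You look gorgeous!")
print(doc.cats)  # e.g. {"COMPLIMENT": ..., "INSULT": ...}
```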
@@ -153,10 +151,10 @@ factory = "llm"
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORGANISATION", "LOCATION"]
[components.llm.backend]
@llm_backends = "spacy.Dolly_HF.v1"
# For better performance, use databricks/dolly-v2-12b instead
model = "databricks/dolly-v2-3b"
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
# For better performance, use dolly-v2-12b instead
name = "dolly-v2-3b"
```
Now run:
@@ -191,10 +189,8 @@ nlp.add_pipe(
"@llm_tasks": "spacy.NER.v2",
"labels": ["PERSON", "ORGANISATION", "LOCATION"]
},
"backend": {
"@llm_backends": "spacy.REST.v1",
"api": "OpenAI",
"config": {"model": "gpt-3.5-turbo"},
"model": {
"@llm_models": "spacy.gpt-3.5.v1",
},
},
)
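# Illustrative continuation (an assumption, not shown in this diff): with the
# component added, texts can be processed as usual and the entities inspected.
doc = nlp("Jack and Jill went up the hill.")
print([(ent.text, ent.label_) for ent in doc.ents])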
@@ -312,7 +308,7 @@ Text:
You look gorgeous!
'''
Backend response for doc: You look gorgeous!
Model response for doc: You look gorgeous!
COMPLIMENT
```
@@ -324,8 +320,8 @@ COMPLIMENT
## API {id="api"}
`spacy-llm` exposes a `llm` factory with [configurable settings](api/large-language-models#config).
`spacy-llm` exposes an `llm` factory with
[configurable settings](/api/large-language-models#config).
An `llm` component is defined by two main settings:
@@ -372,7 +368,8 @@ method is defined, `spacy-llm` will call it to evaluate the component.
| --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [`task.generate_prompts`](/api/large-language-models#task-generate-prompts) | Takes a collection of documents, and returns a collection of "prompts", which can be of type `Any`. |
| [`task.parse_responses`](/api/large-language-models#task-parse-responses) | Takes a collection of LLM responses and the original documents, parses the responses into structured information, and sets the annotations on the documents. |
| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. |
| [`spacy.Summarization.v1`](/api/large-language-models#summarization-v1) | The summarization task prompts the model for a concise summary of the provided text. |
| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. This version also supports explicitly defining the provided labels with custom descriptions. |
| [`spacy.NER.v1`](/api/large-language-models#ner-v1) | The original version of the built-in NER task supports both zero-shot and few-shot prompting. |
| [`spacy.SpanCat.v2`](/api/large-language-models#spancat-v2) | The built-in SpanCat task is a simple adaptation of the NER task to support overlapping entities and store its annotations in `doc.spans`. |
| [`spacy.SpanCat.v1`](/api/large-language-models#spancat-v1) | The original version of the built-in SpanCat task is a simple adaptation of the v1 NER task to support overlapping entities and store its annotations in `doc.spans`. |
@@ -381,11 +378,12 @@ method is defined, `spacy-llm` will call it to evaluate the component.
| [`spacy.TextCat.v1`](/api/large-language-models#textcat-v1) | Version 1 of the built-in TextCat task supports both zero-shot and few-shot prompting. |
| [`spacy.REL.v1`](/api/large-language-models#rel-v1)                         | The built-in REL task supports both zero-shot and few-shot prompting. It relies on an upstream NER component for entity extraction. |
| [`spacy.Lemma.v1`](/api/large-language-models#lemma-v1) | The `Lemma.v1` task lemmatizes the provided text and updates the `lemma_` attribute in the doc's tokens accordingly. |
| [`spacy.Sentiment.v1`](/api/large-language-models#sentiment-v1) | Performs sentiment analysis on provided texts. |
| [`spacy.NoOp.v1`](/api/large-language-models#noop-v1) | This task is only useful for testing - it tells the LLM to do nothing, and does not set any fields on the `docs`. |
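The `generate_prompts` / `parse_responses` protocol listed above can also be implemented
for custom tasks. As a rough sketch under assumptions - the registry import path and the
hypothetical task name `my.QuestionCheck.v1` are not taken from this diff - a minimal
custom task could look like:
```python
from typing import Iterable

from spacy.tokens import Doc
from spacy_llm.registry import registry  # assumed import path for the registry

# Custom attribute used by the toy task below.
Doc.set_extension("is_question", default=False)


@registry.llm_tasks("my.QuestionCheck.v1")  # hypothetical task name
def make_question_task() -> "QuestionTask":
    return QuestionTask()


class QuestionTask:
    def generate_prompts(self, docs: Iterable[Doc]) -> Iterable[str]:
        # One prompt per document.
        for doc in docs:
            yield f"Reply YES or NO: is the following text a question?\n{doc.text}"

    def parse_responses(
        self, docs: Iterable[Doc], responses: Iterable[str]
    ) -> Iterable[Doc]:
        # Parse the raw LLM replies and set annotations on the documents.
        for doc, response in zip(docs, responses):
            doc._.is_question = "YES" in response.upper()
            yield doc
```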
### Backends {id="backends"}
### Models {id="models"}
A _backend_ defines which LLM model to query, and how to query it. It can be a
A _model_ defines which LLM model to query, and how to query it. It can be a
simple function taking a collection of prompts (consistent with the output type
of `task.generate_prompts()`) and returning a collection of responses
(consistent with the expected input of `parse_responses`). Generally speaking,
@@ -393,52 +391,101 @@ it's a function of type `Callable[[Iterable[Any]], Iterable[Any]]`, but specific
implementations can have other signatures, like
`Callable[[Iterable[str]], Iterable[str]]`.
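To make that signature concrete, a minimal sketch of such a callable is shown below; the
registry import path and the name `my.EchoModel.v1` are assumptions for illustration only:
```python
from typing import Iterable

from spacy_llm.registry import registry  # assumed import path for the registry


@registry.llm_models("my.EchoModel.v1")  # hypothetical model name
def make_echo_model():
    def echo(prompts: Iterable[str]) -> Iterable[str]:
        # A stand-in "model" that returns the prompts unchanged; a real
        # implementation would send them to an LLM and return its replies.
        return list(prompts)

    return echo
```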
All built-in backends are registered in `llm_backends`. If no backend is
specified, the repo currently connects to the [`OpenAI` API](#openai) by
default, using the built-in REST protocol, and accesses the `"gpt-3.5-turbo"`
model.
All built-in models are registered in `llm_models`. If no model is specified,
the repo currently connects to the `OpenAI` API by default using REST, and
accesses the `"gpt-3.5-turbo"` model.
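As a small illustration (not taken from this diff), relying on that default only requires
a task, assuming `OPENAI_API_KEY` is set in the environment:
```python
import spacy

nlp = spacy.blank("en")
# No "model" block: the component falls back to the default (OpenAI, "gpt-3.5-turbo").
nlp.add_pipe(
    "llm",
    config={
        "task": {"@llm_tasks": "spacy.TextCat.v2", "labels": ["COMPLIMENT", "INSULT"]}
    },
)
```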
Currently three different approaches to using LLMs are supported:
1. `spacy-llm`'s native REST backend. This is the default for all hosted models
   (e.g. OpenAI, Cohere, Anthropic, ...).
2. A HuggingFace integration that allows running a limited set of HF models
   locally.
3. A LangChain integration that allows running any model supported by LangChain
   (hosted or local).
Approaches 1 and 2 are the defaults for hosted and local models, respectively.
Alternatively, you can use LangChain to access hosted or local models by
specifying one of the models registered with the `langchain.` prefix.
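For illustration, selecting a LangChain-backed model could look like the sketch below;
the registered name `langchain.OpenAI.v1` and the `name` argument are assumptions that
may differ between versions:
```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "task": {"@llm_tasks": "spacy.NER.v2", "labels": ["PERSON", "LOCATION"]},
        # Hypothetical registry entry: LangChain-backed models use the "langchain." prefix.
        "model": {"@llm_models": "langchain.OpenAI.v1", "name": "gpt-3.5-turbo"},
    },
)
```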
<Infobox>
_Why are there backends for third-party libraries in addition to a
native REST backend and which should I choose?_
_Why LangChain if there are also native REST and HuggingFace integrations? When should I use what?_
Third-party libraries like `langchain` or `minichain` focus on prompt
management, integration of many different LLM APIs, and other related features
such as conversational memory or agents. `spacy-llm` on the other hand
emphasizes features we consider useful in the context of NLP pipelines utilizing
LLMs to process documents (mostly) independent from each other. It makes sense
that the feature set of such third-party libraries and `spacy-llm` is not
identical - and users might want to take advantage of features not available in
`spacy-llm`.
Third-party libraries like `langchain` focus on prompt management, integration
of many different LLM APIs, and other related features such as conversational
memory or agents. `spacy-llm` on the other hand emphasizes features we consider
useful in the context of NLP pipelines utilizing LLMs to process documents
(mostly) independent from each other. It makes sense that the feature sets of
such third-party libraries and `spacy-llm` aren't identical - and users might
want to take advantage of features not available in `spacy-llm`.
The advantage of offering our own REST backend is that we can ensure a larger
degree of stability and robustness, as we can guarantee backwards-compatibility
and more smoothly integrated error handling.
The advantage of implementing our own REST and HuggingFace integrations is that
we can ensure a larger degree of stability and robustness, as we can guarantee
backwards-compatibility and more smoothly integrated error handling.
Ultimately we recommend trying to implement your use case using the REST backend
first (which is configured as the default backend). If however there are
features or APIs not covered by `spacy-llm`, it's trivial to switch to the
backend of a third-party library - and easy to customize the prompting
If, however, there are features or APIs not natively covered by `spacy-llm`, it's
trivial to use LangChain to cover them - and easy to customize the prompting
mechanism, if so required.
</Infobox>
| Component | Description |
| ------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
| [`OpenAI`](/api/large-language-models#openai) | ?? |
| [`spacy.REST.v1`](/api/large-language-models#rest-v1) | This default backend uses `requests` and a simple retry mechanism to access an API. |
| [`spacy.MiniChain.v1`](/api/large-language-models#minichain-v1) | Use [MiniChain](https://github.com/srush/MiniChain) for the API retrieval. |
| [`spacy.LangChain.v1`](/api/large-language-models#langchain-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
| [`spacy.Dolly_HF.v1`](/api/large-language-models#dollyhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
| [`spacy.StableLM_HF.v1`](/api/large-language-models#stablelmhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
| [`spacy.OpenLLaMaHF.v1`](/api/large-language-models#openllamahf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
Note that when using hosted services, you have to ensure that the proper API
keys are set as environment variables as described by the corresponding
provider's documentation.
For example, when using OpenAI, you have to get an API key from openai.com and
ensure that the keys are set as environment variables:
```shell
export OPENAI_API_KEY="sk-..."
export OPENAI_API_ORG="org-..."
```
For Cohere it's
```shell
export CO_API_KEY="..."
```
and for Anthropic
```shell
export ANTHROPIC_API_KEY="..."
```
| Component | Description |
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------ |
| [`spacy.GPT-4.v1`](/api/large-language-models#gpt-4)                           | OpenAI's `gpt-4` model family.                                                        |
| [`spacy.GPT-3-5.v1`](/api/large-language-models#gpt-3-5)                       | OpenAI's `gpt-3-5` model family.                                                      |
| [`spacy.Text-Davinci.v1`](/api/large-language-models#text-davinci)             | OpenAI's `text-davinci` model family.                                                 |
| [`spacy.Code-Davinci.v1`](/api/large-language-models#code-davinci)             | OpenAI's `code-davinci` model family.                                                 |
| [`spacy.Text-Curie.v1`](/api/large-language-models#text-curie)                 | OpenAI's `text-curie` model family.                                                   |
| [`spacy.Text-Babbage.v1`](/api/large-language-models#text-babbage)             | OpenAI's `text-babbage` model family.                                                 |
| [`spacy.Text-Ada.v1`](/api/large-language-models#text-ada)                     | OpenAI's `text-ada` model family.                                                     |
| [`spacy.Davinci.v1`](/api/large-language-models#davinci)                       | OpenAI's `davinci` model family.                                                      |
| [`spacy.Curie.v1`](/api/large-language-models#curie)                           | OpenAI's `curie` model family.                                                        |
| [`spacy.Babbage.v1`](/api/large-language-models#babbage)                       | OpenAI's `babbage` model family.                                                      |
| [`spacy.Ada.v1`](/api/large-language-models#ada)                               | OpenAI's `ada` model family.                                                          |
| [`spacy.Command.v1`](/api/large-language-models#command)                       | Cohere's `command` model family.                                                      |
| [`spacy.Claude-1.v1`](/api/large-language-models#claude-1)                     | Anthropic's `claude-1` model family.                                                  |
| [`spacy.Claude-instant-1.v1`](/api/large-language-models#claude-instant-1)     | Anthropic's `claude-instant-1` model family.                                          |
| [`spacy.Claude-instant-1-1.v1`](/api/large-language-models#claude-instant-1-1) | Anthropic's `claude-instant-1.1` model family.                                        |
| [`spacy.Claude-1-0.v1`](/api/large-language-models#claude-1-0)                 | Anthropic's `claude-1.0` model family.                                                |
| [`spacy.Claude-1-2.v1`](/api/large-language-models#claude-1-2)                 | Anthropic's `claude-1.2` model family.                                                |
| [`spacy.Claude-1-3.v1`](/api/large-language-models#claude-1-3)                 | Anthropic's `claude-1.3` model family.                                                |
| [`spacy.Dolly.v1`](/api/large-language-models#dolly) | Dolly models through [Databricks](https://huggingface.co/databricks) on HuggingFace. |
| [`spacy.Falcon.v1`](/api/large-language-models#falcon) | Falcon model through HuggingFace. |
| [`spacy.StableLM.v1`](/api/large-language-models#stablelm) | StableLM model through HuggingFace. |
| [`spacy.OpenLLaMA.v1`](/api/large-language-models#openllama) | OpenLLaMA model through HuggingFace. |
| [LangChain models](/api/large-language-models#langchain-models) | LangChain models for API retrieval. |
### Cache {id="cache"}
Interacting with LLMs, either through an external API or a local instance, is
costly. Since developing an NLP pipeline generally means a lot of exploration
and prototyping, `spacy-llm` implements a built-in [cache](/api/large-language-models#cache) to avoid reprocessing
the same documents at each run that keeps batches of documents stored on disk.
and prototyping, `spacy-llm` implements a built-in
[cache](/api/large-language-models#cache) that keeps batches of documents stored
on disk to avoid reprocessing the same documents at each run.
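For illustration only, a hedged sketch of enabling the cache via `add_pipe`; the
registered name `spacy.BatchCache.v1` and its parameters are assumptions here and should
be checked against the API documentation:
```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "task": {"@llm_tasks": "spacy.NER.v2", "labels": ["PERSON", "LOCATION"]},
        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
        # Hypothetical cache settings; exact names may differ.
        "cache": {
            "@llm_misc": "spacy.BatchCache.v1",
            "path": "local-cache",
            "batch_size": 64,
            "max_batches_in_mem": 4,
        },
    },
)
```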
### Various functions {id="various-functions"}