Mirror of https://github.com/explosion/spaCy.git (synced 2025-08-05 21:00:19 +03:00)

update for v0.4.0

parent: b552b4f60a
commit: 682dec77cf
@@ -21,10 +21,9 @@ required.
 
 - Serializable `llm` **component** to integrate prompts into your pipeline
 - **Modular functions** to define the [**task**](#tasks) (prompting and parsing)
-  and [**backend**](#backends) (model to use)
+  and [**model**](#models) (model to use)
 - Support for **hosted APIs** and self-hosted **open-source models**
-- Integration with [`MiniChain`](https://github.com/srush/MiniChain) and
-  [`LangChain`](https://github.com/hwchase17/langchain)
+- Integration with [`LangChain`](https://github.com/hwchase17/langchain)
 - Access to
   **[OpenAI API](https://platform.openai.com/docs/api-reference/introduction)**,
   including GPT-4 and various GPT-3 models
@@ -85,9 +84,9 @@ python -m pip install spacy-llm
 
 ## Usage {id="usage"}
 
-The task and the backend have to be supplied to the `llm` pipeline component
-using [spaCy's config system](https://spacy.io/api/data-formats#config). This
-package provides various built-in functionality, as detailed in the [API](#-api)
+The task and the model have to be supplied to the `llm` pipeline component using
+[spaCy's config system](https://spacy.io/api/data-formats#config). This package
+provides various built-in functionality, as detailed in the [API](#-api)
 documentation.
 
 ### Example 1: Add a text classifier using a GPT-3 model from OpenAI {id="example-1"}
@@ -114,10 +113,9 @@ factory = "llm"
 @llm_tasks = "spacy.TextCat.v2"
 labels = ["COMPLIMENT", "INSULT"]
 
-[components.llm.backend]
-@llm_backends = "spacy.REST.v1"
-api = "OpenAI"
-config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
+[components.llm.model]
+@llm_models = "spacy.GPT-3-5.v1"
+config = {"temperature": 0.3}
 ```
 
 Now run:
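The `Now run:` snippet that follows this hunk is collapsed by the diff viewer. For orientation, a minimal sketch of what driving this config typically looks like, assuming the block above is saved as `config.cfg`, that `OPENAI_API_KEY` is set, and using the `assemble` helper from `spacy_llm.util`:

```python
from spacy_llm.util import assemble

# Build the pipeline from the config file shown in the hunk above.
nlp = assemble("config.cfg")
doc = nlp("You look gorgeous!")
# The TextCat task writes its classification to doc.cats.
print(doc.cats)  # e.g. {"COMPLIMENT": 1.0, "INSULT": 0.0}
```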
@@ -153,10 +151,10 @@ factory = "llm"
 @llm_tasks = "spacy.NER.v2"
 labels = ["PERSON", "ORGANISATION", "LOCATION"]
 
-[components.llm.backend]
-@llm_backends = "spacy.Dolly_HF.v1"
-# For better performance, use databricks/dolly-v2-12b instead
-model = "databricks/dolly-v2-3b"
+[components.llm.model]
+@llm_models = "spacy.Dolly.v1"
+# For better performance, use dolly-v2-12b instead
+name = "dolly-v2-3b"
 ```
 
 Now run:
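As above, the collapsed `Now run:` step assembles and applies the pipeline. A sketch under the same assumptions (config saved as `config.cfg`; the first run downloads the Dolly weights from HuggingFace):

```python
from spacy_llm.util import assemble

nlp = assemble("config.cfg")
doc = nlp("Jack and Jill went up the hill in Kathmandu.")
# The NER task writes its entities to doc.ents.
print([(ent.text, ent.label_) for ent in doc.ents])
```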
@@ -191,10 +189,8 @@ nlp.add_pipe(
             "@llm_tasks": "spacy.NER.v2",
             "labels": ["PERSON", "ORGANISATION", "LOCATION"]
         },
-        "backend": {
-            "@llm_backends": "spacy.REST.v1",
-            "api": "OpenAI",
-            "config": {"model": "gpt-3.5-turbo"},
+        "model": {
+            "@llm_models": "spacy.GPT-3-5.v1",
         },
     },
 )
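The hunk above shows only the tail of the `add_pipe` call. For context, a hedged reconstruction of the full snippet under the new v0.4.0 naming; the blank pipeline and example sentence are illustrative, and an OpenAI API key is assumed to be set in the environment:

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v2",
            "labels": ["PERSON", "ORGANISATION", "LOCATION"],
        },
        "model": {
            "@llm_models": "spacy.GPT-3-5.v1",
        },
    },
)
doc = nlp("Jack and Jill went up the hill.")
print([(ent.text, ent.label_) for ent in doc.ents])
```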
@@ -312,7 +308,7 @@ Text:
 You look gorgeous!
 '''
 
-Backend response for doc: You look gorgeous!
+Model response for doc: You look gorgeous!
 COMPLIMENT
 ```
 
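For reference, the `Model response for doc:` line in the output above comes from `spacy-llm`'s debug logging. A sketch of enabling it, assuming the module-level `spacy_llm.logger` handle documented by the package:

```python
import logging

import spacy_llm

# Route spacy-llm's debug output (rendered prompts and raw model
# responses) to stderr.
spacy_llm.logger.addHandler(logging.StreamHandler())
spacy_llm.logger.setLevel(logging.DEBUG)
```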
@@ -324,8 +320,8 @@ COMPLIMENT
 
 ## API {id="api"}
 
-`spacy-llm` exposes a `llm` factory with [configurable settings](/api/large-language-models#config).
-
+`spacy-llm` exposes a `llm` factory with
+[configurable settings](/api/large-language-models#config).
 
 An `llm` component is defined by two main settings:
 
@@ -372,7 +368,8 @@ method is defined, `spacy-llm` will call it to evaluate the component.
 | --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | [`task.generate_prompts`](/api/large-language-models#task-generate-prompts) | Takes a collection of documents and returns a collection of "prompts", which can be of type `Any`. |
 | [`task.parse_responses`](/api/large-language-models#task-parse-responses) | Takes a collection of LLM responses and the original documents, parses the responses into structured information, and sets the annotations on the documents. |
-| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. |
+| [`spacy.Summarization.v1`](/api/large-language-models#summarization-v1) | The summarization task prompts the model for a concise summary of the provided text. |
+| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. This version also supports explicitly defining the provided labels with custom descriptions. |
 | [`spacy.NER.v1`](/api/large-language-models#ner-v1) | The original version of the built-in NER task supports both zero-shot and few-shot prompting. |
 | [`spacy.SpanCat.v2`](/api/large-language-models#spancat-v2) | The built-in SpanCat task is a simple adaptation of the NER task to support overlapping entities and store its annotations in `doc.spans`. |
 | [`spacy.SpanCat.v1`](/api/large-language-models#spancat-v1) | The original version of the built-in SpanCat task is a simple adaptation of the v1 NER task to support overlapping entities and store its annotations in `doc.spans`. |
@@ -381,11 +378,12 @@ method is defined, `spacy-llm` will call it to evaluate the component.
 | [`spacy.TextCat.v1`](/api/large-language-models#textcat-v1) | Version 1 of the built-in TextCat task supports both zero-shot and few-shot prompting. |
 | [`spacy.REL.v1`](/api/large-language-models#rel-v1) | The built-in REL task supports both zero-shot and few-shot prompting. It relies on an upstream NER component for entity extraction. |
 | [`spacy.Lemma.v1`](/api/large-language-models#lemma-v1) | The `Lemma.v1` task lemmatizes the provided text and updates the `lemma_` attribute in the doc's tokens accordingly. |
+| [`spacy.Sentiment.v1`](/api/large-language-models#sentiment-v1) | Performs sentiment analysis on provided texts. |
 | [`spacy.NoOp.v1`](/api/large-language-models#noop-v1) | This task is only useful for testing - it tells the LLM to do nothing, and does not set any fields on the `docs`. |
 
-### Backends {id="backends"}
+### Models {id="models"}
 
-A _backend_ defines which LLM model to query, and how to query it. It can be a
+A _model_ defines which LLM to query, and how to query it. It can be a
 simple function taking a collection of prompts (consistent with the output type
 of `task.generate_prompts()`) and returning a collection of responses
 (consistent with the expected input of `parse_responses`). Generally speaking,
@@ -393,52 +391,101 @@ it's a function of type `Callable[[Iterable[Any]], Iterable[Any]]`, but specific
 implementations can have other signatures, like
 `Callable[[Iterable[str]], Iterable[str]]`.
 
-All built-in backends are registered in `llm_backends`. If no backend is
-specified, the repo currently connects to the [`OpenAI` API](#openai) by
-default, using the built-in REST protocol, and accesses the `"gpt-3.5-turbo"`
-model.
+All built-in models are registered in `llm_models`. If no model is specified,
+the repo currently connects to the `OpenAI` API by default using REST, and
+accesses the `"gpt-3.5-turbo"` model.
+
+Currently three different approaches to using LLMs are supported:
+
+1. `spacy-llm`'s native REST backend. This is the default for all hosted models
+   (e.g. OpenAI, Cohere, Anthropic, ...).
+2. A HuggingFace integration that allows you to run a limited set of HF models
+   locally.
+3. A LangChain integration that allows you to run any model supported by
+   LangChain (hosted or local).
+
+Approaches 1 and 2 are the defaults for hosted and local models, respectively.
+Alternatively you can use LangChain to access hosted or local models by
+specifying one of the models registered with the `langchain.` prefix.
 
 <Infobox>
-_Why are there backends for third-party libraries in addition to a
-native REST backend and which should I choose?_
+_Why LangChain if there are also a native REST and a HuggingFace backend? When should I use what?_
 
-Third-party libraries like `langchain` or `minichain` focus on prompt
-management, integration of many different LLM APIs, and other related features
-such as conversational memory or agents. `spacy-llm` on the other hand
-emphasizes features we consider useful in the context of NLP pipelines utilizing
-LLMs to process documents (mostly) independent from each other. It makes sense
-that the feature set of such third-party libraries and `spacy-llm` is not
-identical - and users might want to take advantage of features not available in
-`spacy-llm`.
+Third-party libraries like `langchain` focus on prompt management, integration
+of many different LLM APIs, and other related features such as conversational
+memory or agents. `spacy-llm` on the other hand emphasizes features we consider
+useful in the context of NLP pipelines utilizing LLMs to process documents
+(mostly) independently of each other. It makes sense that the feature sets of
+such third-party libraries and `spacy-llm` aren't identical - and users might
+want to take advantage of features not available in `spacy-llm`.
 
-The advantage of offering our own REST backend is that we can ensure a larger
-degree of stability and robustness, as we can guarantee backwards-compatibility
-and more smoothly integrated error handling.
+The advantage of implementing our own REST and HuggingFace integrations is that
+we can ensure a larger degree of stability and robustness, as we can guarantee
+backwards-compatibility and more smoothly integrated error handling.
 
-Ultimately we recommend trying to implement your use case using the REST backend
-first (which is configured as the default backend). If however there are
-features or APIs not covered by `spacy-llm`, it's trivial to switch to the
-backend of a third-party library - and easy to customize the prompting
-mechanism, if so required.
+If however there are features or APIs not natively covered by `spacy-llm`, it's
+trivial to utilize LangChain to cover this - and easy to customize the prompting
+mechanism, if so required.
 
 </Infobox>
 
+Note that when using hosted services, you have to ensure that the proper API
+keys are set as environment variables as described by the corresponding
+provider's documentation.
+
+E.g. when using OpenAI, you have to get an API key from openai.com and ensure
+that the keys are set as environment variables:
+
+```shell
+export OPENAI_API_KEY="sk-..."
+export OPENAI_API_ORG="org-..."
+```
+
+For Cohere it's
+
+```shell
+export CO_API_KEY="..."
+```
+
+and for Anthropic
+
+```shell
+export ANTHROPIC_API_KEY="..."
+```
+
 | Component | Description |
-| ------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
-| [`OpenAI`](/api/large-language-models#openai) | ?? |
-| [`spacy.REST.v1`](/api/large-language-models#rest-v1) | This default backend uses `requests` and a simple retry mechanism to access an API. |
-| [`spacy.MiniChain.v1`](/api/large-language-models#minichain-v1) | Use [MiniChain](https://github.com/srush/MiniChain) for the API retrieval. |
-| [`spacy.LangChain.v1`](/api/large-language-models#langchain-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
-| [`spacy.Dolly_HF.v1`](/api/large-language-models#dollyhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
-| [`spacy.StableLM_HF.v1`](/api/large-language-models#stablelmhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
-| [`spacy.OpenLLaMaHF.v1`](/api/large-language-models#openllamahf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
+| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------ |
+| [`spacy.GPT-4.v1`](/api/large-language-models#gpt-4) | OpenAI’s `gpt-4` model family. |
+| [`spacy.GPT-3-5.v1`](/api/large-language-models#gpt-3-5) | OpenAI’s `gpt-3-5` model family. |
+| [`spacy.Text-Davinci.v1`](/api/large-language-models#text-davinci) | OpenAI’s `text-davinci` model family. |
+| [`spacy.Code-Davinci.v1`](/api/large-language-models#code-davinci) | OpenAI’s `code-davinci` model family. |
+| [`spacy.Text-Curie.v1`](/api/large-language-models#text-curie) | OpenAI’s `text-curie` model family. |
+| [`spacy.Text-Babbage.v1`](/api/large-language-models#text-babbage) | OpenAI’s `text-babbage` model family. |
+| [`spacy.Text-Ada.v1`](/api/large-language-models#text-ada) | OpenAI’s `text-ada` model family. |
+| [`spacy.Davinci.v1`](/api/large-language-models#davinci) | OpenAI’s `davinci` model family. |
+| [`spacy.Curie.v1`](/api/large-language-models#curie) | OpenAI’s `curie` model family. |
+| [`spacy.Babbage.v1`](/api/large-language-models#babbage) | OpenAI’s `babbage` model family. |
+| [`spacy.Ada.v1`](/api/large-language-models#ada) | OpenAI’s `ada` model family. |
+| [`spacy.Command.v1`](/api/large-language-models#command) | Cohere’s `command` model family. |
+| [`spacy.Claude-1.v1`](/api/large-language-models#claude-1) | Anthropic’s `claude-1` model family. |
+| [`spacy.Claude-instant-1.v1`](/api/large-language-models#claude-instant-1) | Anthropic’s `claude-instant-1` model family. |
+| [`spacy.Claude-instant-1-1.v1`](/api/large-language-models#claude-instant-1-1) | Anthropic’s `claude-instant-1.1` model family. |
+| [`spacy.Claude-1-0.v1`](/api/large-language-models#claude-1-0) | Anthropic’s `claude-1.0` model family. |
+| [`spacy.Claude-1-2.v1`](/api/large-language-models#claude-1-2) | Anthropic’s `claude-1.2` model family. |
+| [`spacy.Claude-1-3.v1`](/api/large-language-models#claude-1-3) | Anthropic’s `claude-1.3` model family. |
+| [`spacy.Dolly.v1`](/api/large-language-models#dolly) | Dolly models through [Databricks](https://huggingface.co/databricks) on HuggingFace. |
+| [`spacy.Falcon.v1`](/api/large-language-models#falcon) | Falcon model through HuggingFace. |
+| [`spacy.StableLM.v1`](/api/large-language-models#stablelm) | StableLM model through HuggingFace. |
+| [`spacy.OpenLLaMA.v1`](/api/large-language-models#openllama) | OpenLLaMA model through HuggingFace. |
+| [LangChain models](/api/large-language-models#langchain-models) | LangChain models for API retrieval. |
 
 ### Cache {id="cache"}
 
 Interacting with LLMs, either through an external API or a local instance, is
 costly. Since developing an NLP pipeline generally means a lot of exploration
-and prototyping, `spacy-llm` implements a built-in [cache](/api/large-language-models#cache) to avoid reprocessing
-the same documents at each run that keeps batches of documents stored on disk.
+and prototyping, `spacy-llm` implements a built-in
+[cache](/api/large-language-models#cache) that keeps batches of documents
+stored on disk, to avoid reprocessing the same documents at each run.
 
 ### Various functions {id="various-functions"}
 
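The task and model interfaces described in the hunks above are registry-based, which is what makes custom implementations possible. Two illustrative sketches, not part of the diff: the registry handles (`spacy_llm.registry.registry` with the `llm_tasks` and `llm_models` registries) follow the names used in the text, while the namespaces and the task/model logic are invented for illustration. First, a minimal custom task implementing the `generate_prompts`/`parse_responses` pair:

```python
from typing import Iterable

from spacy.tokens import Doc
from spacy_llm.registry import registry


class SimpleSentimentTask:
    """Toy task: asks the model for POSITIVE/NEGATIVE, stores the answer in doc.cats."""

    def generate_prompts(self, docs: Iterable[Doc]) -> Iterable[str]:
        for doc in docs:
            yield f"Answer POSITIVE or NEGATIVE for the following text:\n{doc.text}"

    def parse_responses(
        self, docs: Iterable[Doc], responses: Iterable[str]
    ) -> Iterable[Doc]:
        for doc, response in zip(docs, responses):
            doc.cats["POSITIVE"] = 1.0 if "POSITIVE" in response.upper() else 0.0
            yield doc


@registry.llm_tasks("my_namespace.SimpleSentiment.v1")
def make_simple_sentiment_task() -> SimpleSentimentTask:
    return SimpleSentimentTask()
```

And second, a model: per the section above, in the simplest case just a callable of type `Callable[[Iterable[str]], Iterable[str]]`:

```python
from typing import Callable, Iterable

from spacy_llm.registry import registry


@registry.llm_models("my_namespace.EchoModel.v1")
def make_echo_model() -> Callable[[Iterable[str]], Iterable[str]]:
    # A stand-in "model" that answers every prompt identically. Useful for
    # exercising a task's parsing logic without network access.
    def run(prompts: Iterable[str]) -> Iterable[str]:
        for _ in prompts:
            yield "POSITIVE"

    return run
```

Either function can then be referenced from the config, e.g. `@llm_models = "my_namespace.EchoModel.v1"` under `[components.llm.model]`.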