update for v0.4.0

Victoria Slocum 2023-07-10 10:25:21 +02:00
parent b552b4f60a
commit 682dec77cf
2 changed files with 846 additions and 290 deletions



@@ -21,10 +21,9 @@ required.
- Serializable `llm` **component** to integrate prompts into your pipeline
- **Modular functions** to define the [**task**](#tasks) (prompting and parsing)
and [**backend**](#backends) (model to use)
and [**model**](#models) (model to use)
- Support for **hosted APIs** and self-hosted **open-source models**
- Integration with [`MiniChain`](https://github.com/srush/MiniChain) and
[`LangChain`](https://github.com/hwchase17/langchain)
- Integration with [`LangChain`](https://github.com/hwchase17/langchain)
- Access to
**[OpenAI API](https://platform.openai.com/docs/api-reference/introduction)**,
including GPT-4 and various GPT-3 models
@@ -85,9 +84,9 @@ python -m pip install spacy-llm
## Usage {id="usage"}
The task and the backend have to be supplied to the `llm` pipeline component
using [spaCy's config system](https://spacy.io/api/data-formats#config). This
package provides various built-in functionality, as detailed in the [API](#-api)
The task and the model have to be supplied to the `llm` pipeline component using
[spaCy's config system](https://spacy.io/api/data-formats#config). This package
provides various built-in functionality, as detailed in the [API](#-api)
documentation.
### Example 1: Add a text classifier using a GPT-3 model from OpenAI {id="example-1"}
@@ -114,10 +113,9 @@ factory = "llm"
@llm_tasks = "spacy.TextCat.v2"
labels = ["COMPLIMENT", "INSULT"]
[components.llm.backend]
@llm_backends = "spacy.REST.v1"
api = "OpenAI"
config = {"model": "gpt-3.5-turbo", "temperature": 0.3}
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
```
Now run:
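As an illustrative sketch - assuming the config above is saved as `config.cfg` and an
OpenAI key is set in the environment - running it with the `assemble` helper from
`spacy_llm.util` could look like:
```python
from spacy_llm.util import assemble

# Build the pipeline from the config and classify a text.
nlp = assemble("config.cfg")
doc = nlp("You look gorgeous!")
print(doc.cats)  # e.g. {"COMPLIMENT": ..., "INSULT": ...}
```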
@@ -153,10 +151,10 @@ factory = "llm"
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORGANISATION", "LOCATION"]
[components.llm.backend]
@llm_backends = "spacy.Dolly_HF.v1"
# For better performance, use databricks/dolly-v2-12b instead
model = "databricks/dolly-v2-3b"
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
# For better performance, use dolly-v2-12b instead
name = "dolly-v2-3b"
```
Now run:
@@ -191,10 +189,8 @@ nlp.add_pipe(
"@llm_tasks": "spacy.NER.v2",
"labels": ["PERSON", "ORGANISATION", "LOCATION"]
},
"backend": {
"@llm_backends": "spacy.REST.v1",
"api": "OpenAI",
"config": {"model": "gpt-3.5-turbo"},
"model": {
"@llm_models": "spacy.gpt-3.5.v1",
},
},
)
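# Illustrative continuation (an assumption, not shown in this diff): with the
# component added, texts can be processed as usual and the entities inspected.
doc = nlp("Jack and Jill went up the hill.")
print([(ent.text, ent.label_) for ent in doc.ents])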
@@ -312,7 +308,7 @@ Text:
You look gorgeous!
'''
Backend response for doc: You look gorgeous!
Model response for doc: You look gorgeous!
COMPLIMENT
```
@@ -324,8 +320,8 @@ COMPLIMENT
## API {id="api"}
`spacy-llm` exposes a `llm` factory with [configurable settings](api/large-language-models#config).
`spacy-llm` exposes an `llm` factory with
[configurable settings](/api/large-language-models#config).
An `llm` component is defined by two main settings:
@@ -372,7 +368,8 @@ method is defined, `spacy-llm` will call it to evaluate the component.
| --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [`task.generate_prompts`](/api/large-language-models#task-generate-prompts) | Takes a collection of documents, and returns a collection of "prompts", which can be of type `Any`. |
| [`task.parse_responses`](/api/large-language-models#task-parse-responses) | Takes a collection of LLM responses and the original documents, parses the responses into structured information, and sets the annotations on the documents. |
| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. |
| [`spacy.Summarization.v1`](/api/large-language-models#summarization-v1) | The summarization task prompts the model for a concise summary of the provided text. |
| [`spacy.NER.v2`](/api/large-language-models#ner-v2) | The built-in NER task supports both zero-shot and few-shot prompting. This version also supports explicitly defining the provided labels with custom descriptions. |
| [`spacy.NER.v1`](/api/large-language-models#ner-v1) | The original version of the built-in NER task supports both zero-shot and few-shot prompting. |
| [`spacy.SpanCat.v2`](/api/large-language-models#spancat-v2) | The built-in SpanCat task is a simple adaptation of the NER task to support overlapping entities and store its annotations in `doc.spans`. |
| [`spacy.SpanCat.v1`](/api/large-language-models#spancat-v1) | The original version of the built-in SpanCat task is a simple adaptation of the v1 NER task to support overlapping entities and store its annotations in `doc.spans`. |
@@ -381,11 +378,12 @@ method is defined, `spacy-llm` will call it to evaluate the component.
| [`spacy.TextCat.v1`](/api/large-language-models#textcat-v1) | Version 1 of the built-in TextCat task supports both zero-shot and few-shot prompting. |
| [`spacy.REL.v1`](/api/large-language-models#rel-v1)                         | The built-in REL task supports both zero-shot and few-shot prompting. It relies on an upstream NER component for entity extraction. |
| [`spacy.Lemma.v1`](/api/large-language-models#lemma-v1) | The `Lemma.v1` task lemmatizes the provided text and updates the `lemma_` attribute in the doc's tokens accordingly. |
| [`spacy.Sentiment.v1`](/api/large-language-models#sentiment-v1) | Performs sentiment analysis on provided texts. |
| [`spacy.NoOp.v1`](/api/large-language-models#noop-v1) | This task is only useful for testing - it tells the LLM to do nothing, and does not set any fields on the `docs`. |
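The `generate_prompts` / `parse_responses` protocol listed above can also be implemented
for custom tasks. As a rough sketch under assumptions - the registry import path and the
hypothetical task name `my.QuestionCheck.v1` are not taken from this diff - a minimal
custom task could look like:
```python
from typing import Iterable

from spacy.tokens import Doc
from spacy_llm.registry import registry  # assumed import path for the registry

# Custom attribute used by the toy task below.
Doc.set_extension("is_question", default=False)


@registry.llm_tasks("my.QuestionCheck.v1")  # hypothetical task name
def make_question_task() -> "QuestionTask":
    return QuestionTask()


class QuestionTask:
    def generate_prompts(self, docs: Iterable[Doc]) -> Iterable[str]:
        # One prompt per document.
        for doc in docs:
            yield f"Reply YES or NO: is the following text a question?\n{doc.text}"

    def parse_responses(
        self, docs: Iterable[Doc], responses: Iterable[str]
    ) -> Iterable[Doc]:
        # Parse the raw LLM replies and set annotations on the documents.
        for doc, response in zip(docs, responses):
            doc._.is_question = "YES" in response.upper()
            yield doc
```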
### Backends {id="backends"}
### Models {id="models"}
A _backend_ defines which LLM model to query, and how to query it. It can be a
A _model_ defines which LLM model to query, and how to query it. It can be a
simple function taking a collection of prompts (consistent with the output type
of `task.generate_prompts()`) and returning a collection of responses
(consistent with the expected input of `parse_responses`). Generally speaking,
@@ -393,52 +391,101 @@ it's a function of type `Callable[[Iterable[Any]], Iterable[Any]]`, but specific
implementations can have other signatures, like
`Callable[[Iterable[str]], Iterable[str]]`.
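To make that signature concrete, a minimal sketch of such a callable is shown below; the
registry import path and the name `my.EchoModel.v1` are assumptions for illustration only:
```python
from typing import Iterable

from spacy_llm.registry import registry  # assumed import path for the registry


@registry.llm_models("my.EchoModel.v1")  # hypothetical model name
def make_echo_model():
    def echo(prompts: Iterable[str]) -> Iterable[str]:
        # A stand-in "model" that returns the prompts unchanged; a real
        # implementation would send them to an LLM and return its replies.
        return list(prompts)

    return echo
```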
All built-in backends are registered in `llm_backends`. If no backend is
specified, the repo currently connects to the [`OpenAI` API](#openai) by
default, using the built-in REST protocol, and accesses the `"gpt-3.5-turbo"`
model.
All built-in models are registered in `llm_models`. If no model is specified,
the repo currently connects to the `OpenAI` API by default using REST, and
accesses the `"gpt-3.5-turbo"` model.
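As a small illustration (not taken from this diff), relying on that default only requires
a task, assuming `OPENAI_API_KEY` is set in the environment:
```python
import spacy

nlp = spacy.blank("en")
# No "model" block: the component falls back to the default (OpenAI, "gpt-3.5-turbo").
nlp.add_pipe(
    "llm",
    config={
        "task": {"@llm_tasks": "spacy.TextCat.v2", "labels": ["COMPLIMENT", "INSULT"]}
    },
)
```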
Currently three different approaches to using LLMs are supported:
1. `spacy-llm`'s native REST backend. This is the default for all hosted models
   (e.g. OpenAI, Cohere, Anthropic, ...).
2. A HuggingFace integration that allows running a limited set of HF models
   locally.
3. A LangChain integration that allows running any model supported by LangChain
   (hosted or local).
Approaches 1 and 2 are the defaults for hosted and local models, respectively.
Alternatively, you can use LangChain to access hosted or local models by
specifying one of the models registered with the `langchain.` prefix.
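For illustration, selecting a LangChain-backed model could look like the sketch below;
the registered name `langchain.OpenAI.v1` and the `name` argument are assumptions that
may differ between versions:
```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "task": {"@llm_tasks": "spacy.NER.v2", "labels": ["PERSON", "LOCATION"]},
        # Hypothetical registry entry: LangChain-backed models use the "langchain." prefix.
        "model": {"@llm_models": "langchain.OpenAI.v1", "name": "gpt-3.5-turbo"},
    },
)
```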
<Infobox>
_Why are there backends for third-party libraries in addition to a
native REST backend and which should I choose?_
_Why LangChain if there are also native REST and HuggingFace integrations? When should I use what?_
Third-party libraries like `langchain` or `minichain` focus on prompt
management, integration of many different LLM APIs, and other related features
such as conversational memory or agents. `spacy-llm` on the other hand
emphasizes features we consider useful in the context of NLP pipelines utilizing
LLMs to process documents (mostly) independent from each other. It makes sense
that the feature set of such third-party libraries and `spacy-llm` is not
identical - and users might want to take advantage of features not available in
`spacy-llm`.
Third-party libraries like `langchain` focus on prompt management, integration
of many different LLM APIs, and other related features such as conversational
memory or agents. `spacy-llm` on the other hand emphasizes features we consider
useful in the context of NLP pipelines utilizing LLMs to process documents
(mostly) independent from each other. It makes sense that the feature sets of
such third-party libraries and `spacy-llm` aren't identical - and users might
want to take advantage of features not available in `spacy-llm`.
The advantage of offering our own REST backend is that we can ensure a larger
degree of stability and robustness, as we can guarantee backwards-compatibility
and more smoothly integrated error handling.
The advantage of implementing our own REST and HuggingFace integrations is that
we can ensure a larger degree of stability and robustness, as we can guarantee
backwards-compatibility and more smoothly integrated error handling.
Ultimately we recommend trying to implement your use case using the REST backend
first (which is configured as the default backend). If however there are
features or APIs not covered by `spacy-llm`, it's trivial to switch to the
backend of a third-party library - and easy to customize the prompting
If, however, there are features or APIs not natively covered by `spacy-llm`, it's
trivial to use LangChain to cover them - and easy to customize the prompting
mechanism, if so required.
</Infobox>
| Component | Description |
| ------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
| [`OpenAI`](/api/large-language-models#openai) | ?? |
| [`spacy.REST.v1`](/api/large-language-models#rest-v1) | This default backend uses `requests` and a simple retry mechanism to access an API. |
| [`spacy.MiniChain.v1`](/api/large-language-models#minichain-v1) | Use [MiniChain](https://github.com/srush/MiniChain) for the API retrieval. |
| [`spacy.LangChain.v1`](/api/large-language-models#langchain-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
| [`spacy.Dolly_HF.v1`](/api/large-language-models#dollyhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
| [`spacy.StableLM_HF.v1`](/api/large-language-models#stablelmhf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
| [`spacy.OpenLLaMaHF.v1`](/api/large-language-models#openllamahf-v1) | Use [LangChain](https://github.com/hwchase17/langchain) for the API retrieval. |
Note that when using hosted services, you have to ensure that the proper API
keys are set as environment variables as described by the corresponding
provider's documentation.
For example, when using OpenAI, you have to get an API key from openai.com and
ensure that the keys are set as environment variables:
```shell
export OPENAI_API_KEY="sk-..."
export OPENAI_API_ORG="org-..."
```
For Cohere it's
```shell
export CO_API_KEY="..."
```
and for Anthropic
```shell
export ANTHROPIC_API_KEY="..."
```
| Component | Description |
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------ |
| [`spacy.GPT-4.v1`](/api/large-language-models#gpt-4)                           | OpenAI's `gpt-4` model family.                                                        |
| [`spacy.GPT-3-5.v1`](/api/large-language-models#gpt-3-5)                       | OpenAI's `gpt-3-5` model family.                                                      |
| [`spacy.Text-Davinci.v1`](/api/large-language-models#text-davinci)             | OpenAI's `text-davinci` model family.                                                 |
| [`spacy.Code-Davinci.v1`](/api/large-language-models#code-davinci)             | OpenAI's `code-davinci` model family.                                                 |
| [`spacy.Text-Curie.v1`](/api/large-language-models#text-curie)                 | OpenAI's `text-curie` model family.                                                   |
| [`spacy.Text-Babbage.v1`](/api/large-language-models#text-babbage)             | OpenAI's `text-babbage` model family.                                                 |
| [`spacy.Text-Ada.v1`](/api/large-language-models#text-ada)                     | OpenAI's `text-ada` model family.                                                     |
| [`spacy.Davinci.v1`](/api/large-language-models#davinci)                       | OpenAI's `davinci` model family.                                                      |
| [`spacy.Curie.v1`](/api/large-language-models#curie)                           | OpenAI's `curie` model family.                                                        |
| [`spacy.Babbage.v1`](/api/large-language-models#babbage)                       | OpenAI's `babbage` model family.                                                      |
| [`spacy.Ada.v1`](/api/large-language-models#ada)                               | OpenAI's `ada` model family.                                                          |
| [`spacy.Command.v1`](/api/large-language-models#command)                       | Cohere's `command` model family.                                                      |
| [`spacy.Claude-1.v1`](/api/large-language-models#claude-1)                     | Anthropic's `claude-1` model family.                                                  |
| [`spacy.Claude-instant-1.v1`](/api/large-language-models#claude-instant-1)     | Anthropic's `claude-instant-1` model family.                                          |
| [`spacy.Claude-instant-1-1.v1`](/api/large-language-models#claude-instant-1-1) | Anthropic's `claude-instant-1.1` model family.                                        |
| [`spacy.Claude-1-0.v1`](/api/large-language-models#claude-1-0)                 | Anthropic's `claude-1.0` model family.                                                |
| [`spacy.Claude-1-2.v1`](/api/large-language-models#claude-1-2)                 | Anthropic's `claude-1.2` model family.                                                |
| [`spacy.Claude-1-3.v1`](/api/large-language-models#claude-1-3)                 | Anthropic's `claude-1.3` model family.                                                |
| [`spacy.Dolly.v1`](/api/large-language-models#dolly) | Dolly models through [Databricks](https://huggingface.co/databricks) on HuggingFace. |
| [`spacy.Falcon.v1`](/api/large-language-models#falcon) | Falcon model through HuggingFace. |
| [`spacy.StableLM.v1`](/api/large-language-models#stablelm) | StableLM model through HuggingFace. |
| [`spacy.OpenLLaMA.v1`](/api/large-language-models#openllama) | OpenLLaMA model through HuggingFace. |
| [LangChain models](/api/large-language-models#langchain-models) | LangChain models for API retrieval. |
### Cache {id="cache"}
Interacting with LLMs, either through an external API or a local instance, is
costly. Since developing an NLP pipeline generally means a lot of exploration
and prototyping, `spacy-llm` implements a built-in [cache](/api/large-language-models#cache) to avoid reprocessing
the same documents at each run that keeps batches of documents stored on disk.
and prototyping, `spacy-llm` implements a built-in
[cache](/api/large-language-models#cache) that keeps batches of documents stored
on disk to avoid reprocessing the same documents at each run.
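For illustration only, a hedged sketch of enabling the cache via `add_pipe`; the
registered name `spacy.BatchCache.v1` and its parameters are assumptions here and should
be checked against the API documentation:
```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "task": {"@llm_tasks": "spacy.NER.v2", "labels": ["PERSON", "LOCATION"]},
        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
        # Hypothetical cache settings; exact names may differ.
        "cache": {
            "@llm_misc": "spacy.BatchCache.v1",
            "path": "local-cache",
            "batch_size": 64,
            "max_batches_in_mem": 4,
        },
    },
)
```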
### Various functions {id="various-functions"}