Merge branch 'docs/llm_develop' into docs/llm-translation-task

# Conflicts:
#	website/docs/api/large-language-models.mdx
Raphael Mitsch 2023-12-11 17:21:40 +01:00
commit d981c27c93

@@ -236,6 +236,13 @@ objects. This depends on the return type of the [model](#models).
| `responses` | The generated responses. ~~Iterable[Any]~~ |
| **RETURNS** | The annotated documents. ~~Iterable[Doc]~~ |
### Raw prompting {id="raw"}
Unlike all other tasks, `spacy.Raw.vX` doesn't send the model a specific prompt
wrapping the doc data. Instead it instructs the model to reply directly to the
doc content. This is handy for use cases like question answering (where each doc
contains one question) or if you want each doc to carry its own customized prompt.
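For instance, with a pipeline whose `llm` component uses this task, question
answering boils down to passing each question through as the doc text. A minimal
sketch (assuming a `config.cfg` like the one shown under
[spacy.Raw.v1](#raw-v1) below, plus a configured model and valid credentials):

```python
# Sketch of the question-answering use case. Assumes config.cfg sets up an
# llm component with the spacy.Raw.v1 task and leaves `field` at its
# default value "reply".
from spacy_llm.util import assemble

nlp = assemble("config.cfg")
# The doc text itself serves as the prompt.
doc = nlp("What's the capital of France?")
# The model's answer is stored in the extension attribute named by `field`.
print(doc._.reply)
```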
### Translation {id="translation"}
The translation task translates texts from a defined or inferred source to a
defined target language.
@@ -287,6 +294,60 @@ target_lang = "Spanish"
path = "translation_examples.yml"
```
#### spacy.Raw.v1 {id="raw-v1"}
Note that since this task may request arbitrary information, it doesn't do any
parsing per se: the model response is stored in a custom `Doc` attribute (i.e.
it can be accessed via `doc._.{field}`).
It supports both zero-shot and few-shot prompting.
> #### Example config
>
> ```ini
> [components.llm.task]
> @llm_tasks = "spacy.Raw.v1"
> examples = null
> ```
| Argument | Description |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `template` | Custom prompt template to send to the LLM. Defaults to [raw.v1.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/raw.v1.jinja). ~~str~~ |
| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ |
| `parse_responses` | Callable for parsing LLM responses for this task. Defaults to the internal parsing method for this task. ~~Optional[TaskResponseParser[RawTask]]~~ |
| `prompt_example_type` | Type to use for fewshot examples. Defaults to `RawExample`. ~~Optional[Type[FewshotExample]]~~ |
| `field` | Name of the extension attribute to store the model's reply in (i.e. the reply will be available in `doc._.{field}`). Defaults to `reply`. ~~str~~ |
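The `template` argument can also be filled from a file in the config, e.g. via
the `spacy.FileReader.v1` helper. A sketch, where `raw_template.jinja` is a
hypothetical custom template in your working directory:

```ini
# Sketch: swap the built-in raw.v1.jinja for a custom Jinja template.
# raw_template.jinja is a placeholder file name.
[components.llm.task]
@llm_tasks = "spacy.Raw.v1"

[components.llm.task.template]
@misc = "spacy.FileReader.v1"
path = "raw_template.jinja"
```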
To perform [few-shot learning](/usage/large-language-models#few-shot-prompts),
you can write down a few examples in a separate file and provide these to be
injected into the LLM's prompt. The default reader `spacy.FewShotReader.v1`
supports `.yml`, `.yaml`, `.json` and `.jsonl`.
```yaml
# Each example can follow an arbitrary pattern. It might help prompt
# performance though if the examples resemble the actual docs' content.
- text: "3 + 5 = x. What's x?"
  reply: '8'
- text: 'Write me a limerick.'
  reply:
    "There was an Old Man with a beard, Who said, 'It is just as I feared! Two
    Owls and a Hen, Four Larks and a Wren, Have all built their nests in my
    beard!'"
- text: "Analyse the sentiment of the text 'This is great'."
  reply: "'This is great' expresses a very positive sentiment."
```
```ini
[components.llm.task]
@llm_tasks = "spacy.Raw.v1"
field = "llm_reply"

[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "raw_examples.yml"
```
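Putting it together, a minimal usage sketch, assuming the config above is saved
as `config.cfg` next to `raw_examples.yml` and completed with a
`[components.llm.model]` block and valid credentials:

```python
# Sketch: run the few-shot setup above over several docs at once.
from spacy_llm.util import assemble

nlp = assemble("config.cfg")
prompts = ["3 + 7 = x. What's x?", "Write me a haiku."]
for doc in nlp.pipe(prompts):
    # field = "llm_reply" in the config, so replies land in doc._.llm_reply.
    print(doc._.llm_reply)
```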
### Summarization {id="summarization"}
A summarization task takes a document as input and generates a summary that is
stored in an extension attribute.