From d56ee65ddffb00c147e5668087c054c4d6b6072a Mon Sep 17 00:00:00 2001 From: Raphael Mitsch Date: Mon, 11 Dec 2023 17:41:04 +0100 Subject: [PATCH] Document `spacy-llm`'s `TranslationTask` (#13183) * Describe translation task. * Fix references to examples and template. * Format. --- website/docs/api/large-language-models.mdx | 54 +++++++++++++++++++++- 1 file changed, 53 insertions(+), 1 deletion(-) diff --git a/website/docs/api/large-language-models.mdx b/website/docs/api/large-language-models.mdx index 4c303e7c8..d658e9dda 100644 --- a/website/docs/api/large-language-models.mdx +++ b/website/docs/api/large-language-models.mdx @@ -241,7 +241,59 @@ objects. This depends on the return type of the [model](#models). Different to all other tasks `spacy.Raw.vX` doesn't provide a specific prompt, wrapping doc data, to the model. Instead it instructs the model to reply to the doc content. This is handy for use cases like question answering (where each doc -contains one question) or if you want to include customized prompts for each doc. +contains one question) or if you want to include customized prompts for each +doc. + +### Translation {id="translation"} + +The translation task translates texts from a defined or inferred source to a +defined target language. + +#### spacy.Translation.v1 {id="translation-v1"} + +`spacy.Translation.v1` supports both zero-shot and few-shot prompting. + +> #### Example config +> +> ```ini +> [components.llm.task] +> @llm_tasks = "spacy.Translation.v1" +> examples = null +> target_lang = "Spanish" +> ``` + +| Argument | Description | +| --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `template` | Custom prompt template to send to LLM model. Defaults to [translation.v1.jinja](https://github.com/explosion/spacy-llm/blob/main/spacy_llm/tasks/templates/translation.v1.jinja). ~~str~~ | +| `examples` | Optional function that generates examples for few-shot learning. Defaults to `None`. ~~Optional[Callable[[], Iterable[Any]]]~~ | +| `parse_responses` (NEW) | Callable for parsing LLM responses for this task. Defaults to the internal parsing method for this task. ~~Optional[TaskResponseParser[TranslationTask]]~~ | +| `prompt_example_type` (NEW) | Type to use for fewshot examples. Defaults to `TranslationExample`. ~~Optional[Type[FewshotExample]]~~ | +| `source_lang` | Language to translate from. Doesn't have to be set. ~~Optional[str]~~ | +| `target_lang` | Language to translate to. No default value, has to be set. ~~str~~ | +| `field` | Name of extension attribute to store translation in (i. e. the translation will be available in `doc._.{field}`). Defaults to `translation`. ~~str~~ | + +To perform [few-shot learning](/usage/large-language-models#few-shot-prompts), +you can write down a few examples in a separate file, and provide these to be +injected into the prompt to the LLM. The default reader `spacy.FewShotReader.v1` +supports `.yml`, `.yaml`, `.json` and `.jsonl`. + +```yaml +- text: 'Top of the morning to you!' + translation: '¡Muy buenos días!' +- text: 'The weather is great today.' + translation: 'El clima está fantástico hoy.' +- text: 'Do you know what will happen tomorrow?' + translation: '¿Sabes qué pasará mañana?' +``` + +```ini +[components.llm.task] +@llm_tasks = "spacy.Translation.v1" +target_lang = "Spanish" +[components.llm.task.examples] +@misc = "spacy.FewShotReader.v1" +path = "translation_examples.yml" +``` #### spacy.Raw.v1 {id="raw-v1"}