* Update processing-pipelines.md
Under "things to try," inform users they can save metadata when using nlp.pipe(foobar, as_tuples=True)
Link to a new example on the attributes page detailing the following:
> ```
> data = [
> ("Some text to process", {"meta": "foo"}),
> ("And more text...", {"meta": "bar"})
> ]
>
> for doc, context in nlp.pipe(data, as_tuples=True):
> # Let's assume you have a "meta" extension registered on the Doc
> doc._.meta = context["meta"]
> ```
from https://stackoverflow.com/questions/57058798/make-spacy-nlp-pipe-process-tuples-of-text-and-additional-information-to-add-as
* Updating the attributes section
Update the attributes section with example of how extensions can be used to store metadata.
* Update processing-pipelines.md
* Update processing-pipelines.md
Made as_tuples example executable and relocated to the end of the "Processing Text" section.
* Update processing-pipelines.md
* Update processing-pipelines.md
Removed extra line
* Reformat and rephrase
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* add multi-label textcat to menu
* add infobox on textcat API
* add info to v3 migration guide
* small edits
* further fixes in doc strings
* add infobox to textcat architectures
* add textcat_multilabel to overview of built-in components
* spelling
* fix unrelated warn msg
* Add textcat_multilabel to quickstart [ci skip]
* remove separate documentation page for multilabel_textcategorizer
* small edits
* positive label clarification
* avoid duplicating information in self.cfg and fix textcat.score
* fix multilabel textcat too
* revert threshold to storage in cfg
* revert threshold stuff for multi-textcat
Co-authored-by: Ines Montani <ines@ines.io>
* initialize NLP with train corpus
* add more pretraining tests
* more tests
* function to fetch tok2vec layer for pretraining
* clarify parameter name
* test different objectives
* formatting
* fix check for static vectors when using vectors objective
* clarify docs
* logger statement
* fix init_tok2vec and proc.initialize order
* test training after pretraining
* add init_config tests for pretraining
* pop pretraining block to avoid config validation errors
* custom errors