diff --git a/website/docs/usage/v3.md b/website/docs/usage/v3.md
index 44810da58..72971dce2 100644
--- a/website/docs/usage/v3.md
+++ b/website/docs/usage/v3.md
@@ -709,6 +709,48 @@ nlp = spacy.blank("en")
 + nlp.add_pipe("ner", source=source_nlp)
 ```
 
+#### Configuring pipeline components with settings {#migrating-configure-pipe}
+
+Because pipeline components are now added using their string names, you won't
+have to instantiate the [component classes](/api/#architecture-pipeline)
+directly anymore. To configure the component, you can now use the `config`
+argument on [`nlp.add_pipe`](/api/language#add_pipe).
+
+> #### config.cfg (excerpt)
+>
+> ```ini
+> [components.sentencizer]
+> factory = "sentencizer"
+> punct_chars = ["!", ".", "?"]
+> ```
+
+```diff
+punct_chars = ["!", ".", "?"]
+- sentencizer = Sentencizer(punct_chars=punct_chars)
++ sentencizer = nlp.add_pipe("sentencizer", config={"punct_chars": punct_chars})
+```
+
+The `config` corresponds to the component settings in the
+[`config.cfg`](/usage/training#config-components) and will overwrite the default
+config defined by the components.
+
+
+Config values you pass to components **need to be JSON-serializable** and can't
+be arbitrary Python objects. Otherwise, the settings you provide can't be
+represented in the `config.cfg` and spaCy has no way of knowing how to re-create
+your component with the same settings when you load the pipeline back in. If you
+need to pass arbitrary objects to a component, use a
+[registered function](/usage/processing-pipelines#example-stateful-components):
+
+```diff
+- config = {"model": MyTaggerModel()}
++ config = {"model": {"@architectures": "MyTaggerModel"}}
+tagger = nlp.add_pipe("tagger", config=config)
+```
+
+
+
 ### Adding match patterns {#migrating-matcher}
 
 The [`Matcher.add`](/api/matcher#add),