---
title: Language
teaser: A text-processing pipeline
tag: class
source: spacy/language.py
---

Usually you'll load this once per process as `nlp` and pass the instance around
your application. A `Language` object is created when you call
[`spacy.load()`](/api/top-level#spacy.load) and contains the shared vocabulary
and [language data](/usage/adding-languages), optional model data loaded from a
[model package](/models) or a path, and a
[processing pipeline](/usage/processing-pipelines) containing components like
the tagger or parser that are called on a document in order. You can also add
your own processing pipeline components that take a `Doc` object, modify it and
return it.

## Language.\_\_init\_\_ {#init tag="method"}

Initialize a `Language` object.

> #### Example
>
> ```python
> from spacy.vocab import Vocab
> from spacy.language import Language
> nlp = Language(Vocab())
>
> from spacy.lang.en import English
> nlp = English()
> ```

| Name        | Type       | Description                                                                                |
| ----------- | ---------- | ------------------------------------------------------------------------------------------ |
| `vocab`     | `Vocab`    | A `Vocab` object. If `True`, a vocab is created via `Language.Defaults.create_vocab`.      |
| `make_doc`  | callable   | A function that takes text and returns a `Doc` object. Usually a `Tokenizer`.              |
| `meta`      | dict       | Custom meta data for the `Language` class. Is written to by models to add model meta data. |
| **RETURNS** | `Language` | The newly constructed object.                                                              |

## Language.\_\_call\_\_ {#call tag="method"}

Apply the pipeline to some text. The text can span multiple sentences, and can
contain arbitrary whitespace. Alignment into the original string is preserved.

> #### Example
>
> ```python
> doc = nlp(u"An example sentence. Another sentence.")
> assert (doc[0].text, doc[0].head.tag_) == ("An", "NN")
> ```

| Name        | Type    | Description                                                                       |
| ----------- | ------- | --------------------------------------------------------------------------------- |
| `text`      | unicode | The text to be processed.                                                         |
| `disable`   | list    | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
| **RETURNS** | `Doc`   | A container for accessing the annotations.                                        |

<Infobox title="Changed in v2.0" variant="warning">

Pipeline components to skip can now be passed as a list to `disable`, instead of
specifying one keyword argument per component.

```diff
- doc = nlp(u"I don't want parsed", parse=False)
+ doc = nlp(u"I don't want parsed", disable=["parser"])
```

</Infobox>

## Language.pipe {#pipe tag="method"}

Process texts as a stream, and yield `Doc` objects in order. This is usually
more efficient than processing texts one-by-one.

<Infobox title="Important note for spaCy v2.0.x" variant="danger">

Early versions of spaCy used simple statistical models that could be efficiently
multi-threaded, as we were able to entirely release Python's global interpreter
lock. The multi-threading was controlled using the `n_threads` keyword argument
to the `.pipe` method. This keyword argument is deprecated as of v2.1.0.
Future versions may introduce a `n_process` argument for parallel inference via
multiprocessing.

</Infobox>

> #### Example
>
> ```python
> texts = [u"One document.", u"...", u"Lots of documents"]
> for doc in nlp.pipe(texts, batch_size=50):
>     assert doc.is_parsed
> ```

| Name         | Type  | Description                                                                                                                                                |
| ------------ | ----- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `texts`      | -     | A sequence of unicode objects.                                                                                                                             |
| `as_tuples`  | bool  | If set to `True`, inputs should be a sequence of `(text, context)` tuples. Output will then be a sequence of `(doc, context)` tuples. Defaults to `False`. |
| `batch_size` | int   | The number of texts to buffer.                                                                                                                             |
| `disable`    | list  | Names of pipeline components to [disable](/usage/processing-pipelines#disabling).                                                                          |
| **YIELDS**   | `Doc` | Documents in the order of the original text.                                                                                                               |
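
The buffering and `as_tuples` behavior described above can be pictured with a
small plain-Python sketch. This is a toy stand-in, not spaCy's implementation:
`toy_pipe`, `process_batch` and `upper_batch` are hypothetical names, and the
"documents" are just uppercased strings. It only illustrates that inputs are
consumed in batches, that results come back in the original order, and that
with `as_tuples=True` each context is passed through alongside its result.

```python
from itertools import islice


def toy_pipe(texts, process_batch, batch_size=2, as_tuples=False):
    """Toy stand-in for a streaming pipe: buffer, process, yield in order."""
    iterator = iter(texts)
    while True:
        # Buffer up to batch_size items from the stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        if as_tuples:
            # Inputs are (text, context) pairs; only the text is processed.
            raw_texts, contexts = zip(*batch)
            for doc, context in zip(process_batch(list(raw_texts)), contexts):
                yield doc, context
        else:
            for doc in process_batch(batch):
                yield doc


def upper_batch(batch):
    # Hypothetical "pipeline": uppercase every text in the batch.
    return [text.upper() for text in batch]


plain = list(toy_pipe(["a", "b", "c"], upper_batch))
paired = list(toy_pipe([("a", 1), ("b", 2), ("c", 3)], upper_batch, as_tuples=True))
```

The context can be any object, for example a database ID or a metadata
dictionary, which makes it easy to match each result back to its source record.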

## Language.update {#update tag="method"}

Update the models in the pipeline.

> #### Example
>
> ```python
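> from spacy.gold import GoldParse
>
> # train_data: (text, entity_offsets) pairs; optimizer: e.g. from nlp.begin_training()
> for raw_text, entity_offsets in train_data:
>     doc = nlp.make_doc(raw_text)
>     gold = GoldParse(doc, entities=entity_offsets)
>     nlp.update([doc], [gold], drop=0.5, sgd=optimizer)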
> ```

| Name        | Type     | Description                                                                                                                                                                                                         |
| ----------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `docs`      | iterable | A batch of `Doc` objects or unicode. If unicode, a `Doc` object will be created from the text.                                                                                                                      |
| `golds`     | iterable | A batch of `GoldParse` objects or dictionaries. Dictionaries will be used to create [`GoldParse`](/api/goldparse) objects. For the available keys and their usage, see [`GoldParse.__init__`](/api/goldparse#init). |
| `drop`      | float    | The dropout rate.                                                                                                                                                                                                   |
| `sgd`       | callable | An optimizer.                                                                                                                                                                                                       |
| **RETURNS** | dict     | Results from the update.                                                                                                                                                                                            |

## Language.begin_training {#begin_training tag="method"}

Allocate models, pre-process training data and acquire an optimizer.

> #### Example
>
> ```python
> optimizer = nlp.begin_training(gold_tuples)
> ```

| Name          | Type     | Description                  |
| ------------- | -------- | ---------------------------- |
| `gold_tuples` | iterable | Gold-standard training data. |
| `**cfg`       | -        | Config parameters.           |
| **RETURNS**   | callable | An optimizer.                |

## Language.use_params {#use_params tag="contextmanager, method"}

Replace weights of models in the pipeline with those provided in the params
dictionary. Can be used as a context manager, in which case, models go back to
their original weights after the block.

> #### Example
>
> ```python
> with nlp.use_params(optimizer.averages):
>     nlp.to_disk("/tmp/checkpoint")
> ```

| Name     | Type | Description                                   |
| -------- | ---- | --------------------------------------------- |
| `params` | dict | A dictionary of parameters keyed by model ID. |
| `**cfg`  | -    | Config parameters.                            |

## Language.preprocess_gold {#preprocess_gold tag="method"}

Can be called before training to pre-process gold data. By default, it handles
nonprojectivity and adds missing tags to the tag map.

| Name         | Type     | Description                              |
| ------------ | -------- | ---------------------------------------- |
| `docs_golds` | iterable | Tuples of `Doc` and `GoldParse` objects. |
| **YIELDS**   | tuple    | Tuples of `Doc` and `GoldParse` objects. |

## Language.create_pipe {#create_pipe tag="method" new="2"}

Create a pipeline component from a factory.

> #### Example
>
> ```python
> parser = nlp.create_pipe("parser")
> nlp.add_pipe(parser)
> ```

| Name        | Type     | Description                                                                        |
| ----------- | -------- | ---------------------------------------------------------------------------------- |
| `name`      | unicode  | Factory name to look up in [`Language.factories`](/api/language#class-attributes). |
| `config`    | dict     | Configuration parameters to initialize component.                                  |
| **RETURNS** | callable | The pipeline component.                                                            |

## Language.add_pipe {#add_pipe tag="method" new="2"}

Add a component to the processing pipeline. Valid components are callables that
take a `Doc` object, modify it and return it. Only one of `before`, `after`,
`first` or `last` can be set. Default behavior is `last=True`.

> #### Example
>
> ```python
> def component(doc):
>     # modify Doc and return it
>     return doc
>
> nlp.add_pipe(component, before="ner")
> nlp.add_pipe(component, name="custom_name", last=True)
> ```

| Name        | Type     | Description                                                                                                                                                                                                                                            |
| ----------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `component` | callable | The pipeline component.                                                                                                                                                                                                                                |
| `name`      | unicode  | Name of pipeline component. Overwrites existing `component.name` attribute if available. If no `name` is set and the component exposes no name attribute, `component.__name__` is used. An error is raised if the name already exists in the pipeline. |
| `before`    | unicode  | Component name to insert component directly before.                                                                                                                                                                                                    |
| `after`     | unicode  | Component name to insert component directly after.                                                                                                                                                                                                     |
| `first`     | bool     | Insert component first / not first in the pipeline.                                                                                                                                                                                                    |
| `last`      | bool     | Insert component last / not last in the pipeline.                                                                                                                                                                                                      |
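
The placement and naming rules above can be made concrete with a plain-Python
sketch. This is a toy stand-in, not spaCy's implementation: `toy_add_pipe` is a
hypothetical helper that operates on a bare list of `(name, component)` tuples.
It assumes the rules exactly as documented here: at most one positional
argument may be set, the default is to append, and duplicate names are
rejected.

```python
def toy_add_pipe(pipeline, component, name=None, before=None, after=None,
                 first=None, last=None):
    """Insert (name, component) into a list of tuples, following the rules above."""
    if name is None:
        # Fall back to a name attribute, then to the function's __name__.
        name = getattr(component, "name", None) or component.__name__
    names = [existing for existing, _ in pipeline]
    if name in names:
        raise ValueError("'{}' already exists in pipeline".format(name))
    if sum(bool(arg) for arg in (before, after, first, last)) > 1:
        raise ValueError("Only one of before/after/first/last can be set")
    entry = (name, component)
    if first:
        pipeline.insert(0, entry)
    elif before is not None:
        pipeline.insert(names.index(before), entry)
    elif after is not None:
        pipeline.insert(names.index(after) + 1, entry)
    else:
        # Default behavior, equivalent to last=True.
        pipeline.append(entry)


def tagger(doc):
    return doc


def ner(doc):
    return doc


def custom(doc):
    return doc


pipeline = []
toy_add_pipe(pipeline, tagger)
toy_add_pipe(pipeline, ner)
toy_add_pipe(pipeline, custom, before="ner")
toy_add_pipe(pipeline, custom, name="custom_first", first=True)
pipe_names = [name for name, _ in pipeline]
```

Note that the same callable can appear twice as long as each entry gets a
distinct name, which is why the second `custom` entry passes `name` explicitly.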

## Language.has_pipe {#has_pipe tag="method" new="2"}

Check whether a component is present in the pipeline. Equivalent to
`name in nlp.pipe_names`.

> #### Example
>
> ```python
> nlp.add_pipe(lambda doc: doc, name="component")
> assert "component" in nlp.pipe_names
> assert nlp.has_pipe("component")
> ```

| Name        | Type    | Description                                              |
| ----------- | ------- | -------------------------------------------------------- |
| `name`      | unicode | Name of the pipeline component to check.                 |
| **RETURNS** | bool    | Whether a component of that name exists in the pipeline. |

## Language.get_pipe {#get_pipe tag="method" new="2"}

Get a pipeline component for a given component name.

> #### Example
>
> ```python
> parser = nlp.get_pipe("parser")
> custom_component = nlp.get_pipe("custom_component")
> ```

| Name        | Type     | Description                            |
| ----------- | -------- | -------------------------------------- |
| `name`      | unicode  | Name of the pipeline component to get. |
| **RETURNS** | callable | The pipeline component.                |

## Language.replace_pipe {#replace_pipe tag="method" new="2"}

Replace a component in the pipeline.

> #### Example
>
> ```python
> nlp.replace_pipe("parser", my_custom_parser)
> ```

| Name        | Type     | Description                       |
| ----------- | -------- | --------------------------------- |
| `name`      | unicode  | Name of the component to replace. |
| `component` | callable | The pipeline component to insert. |

## Language.rename_pipe {#rename_pipe tag="method" new="2"}

Rename a component in the pipeline. Useful to create custom names for
pre-defined and pre-loaded components. To change the default name of a component
added to the pipeline, you can also use the `name` argument on
[`add_pipe`](/api/language#add_pipe).

> #### Example
>
> ```python
> nlp.rename_pipe("parser", "spacy_parser")
> ```

| Name       | Type    | Description                      |
| ---------- | ------- | -------------------------------- |
| `old_name` | unicode | Name of the component to rename. |
| `new_name` | unicode | New name of the component.       |

## Language.remove_pipe {#remove_pipe tag="method" new="2"}

Remove a component from the pipeline. Returns the removed component name and
component function.

> #### Example
>
> ```python
> name, component = nlp.remove_pipe("parser")
> assert name == "parser"
> ```

| Name        | Type    | Description                                           |
| ----------- | ------- | ----------------------------------------------------- |
| `name`      | unicode | Name of the component to remove.                      |
| **RETURNS** | tuple   | A `(name, component)` tuple of the removed component. |

## Language.disable_pipes {#disable_pipes tag="contextmanager, method" new="2"}

Disable one or more pipeline components. If used as a context manager, the
pipeline will be restored to the initial state at the end of the block.
Otherwise, the returned `DisabledPipes` object has a `.restore()` method
you can use to undo your changes.

> #### Example
>
> ```python
> with nlp.disable_pipes('tagger', 'parser'):
>     nlp.begin_training()
>
> disabled = nlp.disable_pipes('tagger', 'parser')
> nlp.begin_training()
> disabled.restore()
> ```

| Name        | Type            | Description                                                                          |
| ----------- | --------------- | ------------------------------------------------------------------------------------ |
| `*disabled` | unicode         | Names of pipeline components to disable.                                             |
| **RETURNS** | `DisabledPipes` | The disabled pipes that can be restored by calling the object's `.restore()` method. |
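
The two usage styles described above, context manager and explicit
`.restore()`, can be sketched in plain Python. This is a toy stand-in, not
spaCy's `DisabledPipes` class: `ToyDisabledPipes` and `names` are hypothetical,
and the pipeline is a bare list of `(name, component)` tuples with `len` as a
placeholder component.

```python
class ToyDisabledPipes(object):
    """Toy stand-in: removes named components and remembers how to undo it."""

    def __init__(self, pipeline, *disabled):
        self.pipeline = pipeline
        self.original = list(pipeline)  # snapshot used by restore()
        # Drop the disabled components in place.
        pipeline[:] = [(n, c) for n, c in pipeline if n not in disabled]

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Leaving the block restores the initial state.
        self.restore()

    def restore(self):
        self.pipeline[:] = self.original


def names(pipeline):
    return [name for name, _ in pipeline]


pipeline = [("tagger", len), ("parser", len), ("ner", len)]

# Style 1: context manager, restored automatically at the end of the block.
with ToyDisabledPipes(pipeline, "tagger", "parser"):
    inside_block = names(pipeline)
after_block = names(pipeline)

# Style 2: keep the returned object and call restore() explicitly.
disabled = ToyDisabledPipes(pipeline, "ner")
before_restore = names(pipeline)
disabled.restore()
after_restore = names(pipeline)
```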

## Language.to_disk {#to_disk tag="method" new="2"}

Save the current state to a directory. If a model is loaded, this will **include
the model**.

> #### Example
>
> ```python
> nlp.to_disk("/path/to/models")
> ```

| Name      | Type             | Description                                                                                                           |
| --------- | ---------------- | --------------------------------------------------------------------------------------------------------------------- |
| `path`    | unicode / `Path` | A path to a directory, which will be created if it doesn't exist. Paths may be either strings or `Path`-like objects. |
| `disable` | list             | Names of pipeline components to [disable](/usage/processing-pipelines#disabling) and prevent from being saved.        |

## Language.from_disk {#from_disk tag="method" new="2"}

Load state from a directory. Modifies the object in place and returns it. If
the saved `Language` object contains a model, the model will be loaded. Note
that this method is commonly used via subclasses like `English` or `German`
to make language-specific functionality like the
[lexical attribute getters](/usage/adding-languages#lex-attrs) available to the
loaded object.

> #### Example
>
> ```python
> from spacy.language import Language
> nlp = Language().from_disk("/path/to/model")
>
> # using language-specific subclass
> from spacy.lang.en import English
> nlp = English().from_disk("/path/to/en_model")
> ```

| Name        | Type             | Description                                                                       |
| ----------- | ---------------- | --------------------------------------------------------------------------------- |
| `path`      | unicode / `Path` | A path to a directory. Paths may be either strings or `Path`-like objects.        |
| `disable`   | list             | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
| **RETURNS** | `Language`       | The modified `Language` object.                                                   |

<Infobox title="Changed in v2.0" variant="warning">

As of spaCy v2.0, the `save_to_directory` method has been renamed to `to_disk`,
to improve consistency across classes. Pipeline components to prevent from being
loaded can now be added as a list to `disable`, instead of specifying one
keyword argument per component.

```diff
- nlp = spacy.load("en", tagger=False, entity=False)
+ nlp = English().from_disk("/model", disable=["tagger", "ner"])
```

</Infobox>

## Language.to_bytes {#to_bytes tag="method"}

Serialize the current state to a binary string.

> #### Example
>
> ```python
> nlp_bytes = nlp.to_bytes()
> ```

| Name        | Type  | Description                                                                                                         |
| ----------- | ----- | ------------------------------------------------------------------------------------------------------------------- |
| `disable`   | list  | Names of pipeline components to [disable](/usage/processing-pipelines#disabling) and prevent from being serialized. |
| **RETURNS** | bytes | The serialized form of the `Language` object.                                                                       |

## Language.from_bytes {#from_bytes tag="method"}

Load state from a binary string. Note that this method is commonly used via
subclasses like `English` or `German` to make language-specific functionality
like the [lexical attribute getters](/usage/adding-languages#lex-attrs)
available to the loaded object.

> #### Example
>
> ```python
> from spacy.lang.en import English
> nlp_bytes = nlp.to_bytes()
> nlp2 = English()
> nlp2.from_bytes(nlp_bytes)
> ```

| Name         | Type       | Description                                                                       |
| ------------ | ---------- | --------------------------------------------------------------------------------- |
| `bytes_data` | bytes      | The data to load from.                                                            |
| `disable`    | list       | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
| **RETURNS**  | `Language` | The `Language` object.                                                            |

<Infobox title="Changed in v2.0" variant="warning">

Pipeline components to prevent from being loaded can now be added as a list to
`disable`, instead of specifying one keyword argument per component.

```diff
- nlp = English().from_bytes(bytes, tagger=False, entity=False)
+ nlp = English().from_bytes(bytes, disable=["tagger", "ner"])
```

</Infobox>

## Attributes {#attributes}

| Name                                    | Type               | Description                                                                                     |
| --------------------------------------- | ------------------ | ----------------------------------------------------------------------------------------------- |
| `vocab`                                 | `Vocab`            | A container for the lexical types.                                                              |
| `tokenizer`                             | `Tokenizer`        | The tokenizer.                                                                                  |
| `make_doc`                              | `lambda text: Doc` | Create a `Doc` object from unicode text.                                                        |
| `pipeline`                              | list               | List of `(name, component)` tuples describing the current processing pipeline, in order.        |
| `pipe_names` <Tag variant="new">2</Tag> | list               | List of pipeline component names, in order.                                                     |
| `meta`                                  | dict               | Custom meta data for the Language class. If a model is loaded, contains meta data of the model. |
| `path` <Tag variant="new">2</Tag>       | `Path`             | Path to the model data directory, if a model is loaded. Otherwise `None`.                       |

## Class attributes {#class-attributes}

| Name                                   | Type    | Description                                                                                                                         |
| -------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `Defaults`                             | class   | Settings, data and factory methods for creating the `nlp` object and processing pipeline.                                           |
| `lang`                                 | unicode | Two-letter language ID, i.e. [ISO code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).                                     |
| `factories` <Tag variant="new">2</Tag> | dict    | Factories that create pre-defined pipeline components, e.g. the tagger, parser or entity recognizer, keyed by their component name. |
💫 Update website (#3285) <!--- Provide a general summary of your changes in the title. --> ## Description The new website is implemented using [Gatsby](https://www.gatsbyjs.org) with [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This allows authoring content in straightforward Markdown without the usual limitations. Standard elements can be overwritten with powerful [React](http://reactjs.org/) components and wherever Markdown syntax isn't enough, JSX components can be used. Hopefully, this update will also make it much easier to contribute to the docs. Once this PR is merged, I'll implement auto-deployment via [Netlify](https://netlify.com) on a specific branch (to avoid building the website on every PR). There's a bunch of other cool stuff that the new setup will allow us to do – including writing front-end tests, service workers, offline support, implementing a search and so on. This PR also includes various new docs pages and content. Resolves #3270. Resolves #3222. Resolves #2947. Resolves #2837. ### Types of change enhancement ## Checklist <!--- Before you submit the PR, go over this checklist and make sure you can tick off all the boxes. [] -> [x] --> - [x] I have submitted the spaCy Contributor Agreement. - [x] I ran the tests, and all new and existing tests passed. - [x] My changes don't require a change to the documentation, or if they do, I've added all required information. 2019-02-17 21:31:19 +03:00			`---`
			`title: Language`
			`teaser: A text-processing pipeline`
			`tag: class`
			`source: spacy/language.py`
			`---`

			Usually you'll load this once per process as `nlp` and pass the instance around
			your application. The `Language` class is created when you call
			[`spacy.load()`](/api/top-level#spacy.load) and contains the shared vocabulary
			`and [language data](/usage/adding-languages), optional model data loaded from a`
			`[model package](/models) or a path, and a`
			`[processing pipeline](/usage/processing-pipelines) containing components like`
			`the tagger or parser that are called on a document in order. You can also add`
			your own processing pipeline components that take a `Doc` object, modify it and
			`return it.`

			`## Language.\_\_init\_\_ {#init tag="method"}`

			Initialize a `Language` object.

			`> #### Example`
			`>`
			> ```python
			`> from spacy.vocab import Vocab`
			`> from spacy.language import Language`
			`> nlp = Language(Vocab())`
			`>`
			`> from spacy.lang.en import English`
			`> nlp = English()`
			> ```

			`\| Name \| Type \| Description \|`
			`\| ----------- \| ---------- \| ------------------------------------------------------------------------------------------ \|`
			\| `vocab` \| `Vocab` \| A `Vocab` object. If `True`, a vocab is created via `Language.Defaults.create_vocab`. \|
			\| `make_doc` \| callable \| A function that takes text and returns a `Doc` object. Usually a `Tokenizer`. \|
			\| `meta` \| dict \| Custom meta data for the `Language` class. Is written to by models to add model meta data. \|
			\| RETURNS \| `Language` \| The newly constructed object. \|

			`## Language.\_\_call\_\_ {#call tag="method"}`

			`Apply the pipeline to some text. The text can span multiple sentences, and can`
			`contain arbitrary whitespace. Alignment into the original string is preserved.`

			`> #### Example`
			`>`
			> ```python
			`> doc = nlp(u"An example sentence. Another sentence.")`
			`> assert (doc[0].text, doc[0].head.tag_) == ("An", "NN")`
			> ```

			`\| Name \| Type \| Description \|`
			`\| ----------- \| ------- \| --------------------------------------------------------------------------------- \|`
			\| `text` \| unicode \| The text to be processed. \|
			\| `disable` \| list \| Names of pipeline components to [disable](/usage/processing-pipelines#disabling). \|
			\| RETURNS \| `Doc` \| A container for accessing the annotations. \|

			`<Infobox title="Changed in v2.0" variant="warning">`

			`Pipeline components to prevent from being loaded can now be added as a list to`
			`disable`, instead of specifying one keyword argument per component.

			```diff
			`- doc = nlp(u"I don't want parsed", parse=False)`
			`+ doc = nlp(u"I don't want parsed", disable=["parser"])`
			```

			`</Infobox>`

			`## Language.pipe {#pipe tag="method"}`

			Process texts as a stream, and yield `Doc` objects in order. This is usually
			`more efficient than processing texts one-by-one.`

			`<Infobox title="Important note for spaCy v2.0.x" variant="danger">`

			`Early versions of spaCy used simple statistical models that could be efficiently`
			`multi-threaded, as we were able to entirely release Python's global interpreter`
			lock. The multi-threading was controlled using the `n_threads` keyword argument
			to the `.pipe` method. This keyword argument is now deprecated as of v2.1.0.
			Future versions may introduce a `n_process` argument for parallel inference via
			`multiprocessing.`

			`</Infobox>`

			`> #### Example`
			`>`
			> ```python
			`> texts = [u"One document.", u"...", u"Lots of documents"]`
			`> for doc in nlp.pipe(texts, batch_size=50):`
			`> assert doc.is_parsed`
			> ```

			`\| Name \| Type \| Description \|`
			`\| ------------ \| ----- \| ---------------------------------------------------------------------------------------------------------------------------------------------------------- \|`
			\| `texts` \| - \| A sequence of unicode objects. \|
			\| `as_tuples` \| bool \| If set to `True`, inputs should be a sequence of `(text, context)` tuples. Output will then be a sequence of `(doc, context)` tuples. Defaults to `False`. \|
			\| `batch_size` \| int \| The number of texts to buffer. \|
			\| `disable` \| list \| Names of pipeline components to [disable](/usage/processing-pipelines#disabling). \|
			\| YIELDS \| `Doc` \| Documents in the order of the original text. \|

			`## Language.update {#update tag="method"}`

			`Update the models in the pipeline.`

			`> #### Example`
			`>`
			> ```python
			`> for raw_text, entity_offsets in train_data:`
			`> doc = nlp.make_doc(raw_text)`
			`> gold = GoldParse(doc, entities=entity_offsets)`
			`> nlp.update([doc], [gold], drop=0.5, sgd=optimizer)`
			> ```

			`\| Name \| Type \| Description \|`
			`\| ----------- \| -------- \| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- \|`
			\| `docs` \| iterable \| A batch of `Doc` objects or unicode. If unicode, a `Doc` object will be created from the text. \|
			\| `golds` \| iterable \| A batch of `GoldParse` objects or dictionaries. Dictionaries will be used to create [`GoldParse`](/api/goldparse) objects. For the available keys and their usage, see [`GoldParse.__init__`](/api/goldparse#init). \|
			\| `drop` \| float \| The dropout rate. \|
			\| `sgd` \| callable \| An optimizer. \|
			`\| RETURNS \| dict \| Results from the update. \|`

			`## Language.begin_training {#begin_training tag="method"}`

			`Allocate models, pre-process training data and acquire an optimizer.`

			`> #### Example`
			`>`
			> ```python
			`> optimizer = nlp.begin_training(gold_tuples)`
			> ```

			`\| Name \| Type \| Description \|`
			`\| ------------- \| -------- \| ---------------------------- \|`
			\| `gold_tuples` \| iterable \| Gold-standard training data. \|
			\| `**cfg` \| - \| Config parameters. \|
			`\| RETURNS \| callable \| An optimizer. \|`

			`## Language.use_params {#use_params tag="contextmanager, method"}`

			`Replace weights of models in the pipeline with those provided in the params`
			`dictionary. Can be used as a context manager, in which case, models go back to`
			`their original weights after the block.`

			`> #### Example`
			`>`
			> ```python
			`> with nlp.use_params(optimizer.averages):`
			`> nlp.to_disk("/tmp/checkpoint")`
			> ```

			`\| Name \| Type \| Description \|`
			`\| -------- \| ---- \| --------------------------------------------- \|`
			\| `params` \| dict \| A dictionary of parameters keyed by model ID. \|
			\| `**cfg` \| - \| Config parameters. \|

			`## Language.preprocess_gold {#preprocess_gold tag="method"}`

			`Can be called before training to pre-process gold data. By default, it handles`
			`nonprojectivity and adds missing tags to the tag map.`

			`\| Name \| Type \| Description \|`
			`\| ------------ \| -------- \| ---------------------------------------- \|`
			\| `docs_golds` \| iterable \| Tuples of `Doc` and `GoldParse` objects. \|
			\| YIELDS \| tuple \| Tuples of `Doc` and `GoldParse` objects. \|

## Language.create_pipe {#create_pipe tag="method" new="2"}

Create a pipeline component from a factory.

> #### Example
>
> ```python
> parser = nlp.create_pipe("parser")
> nlp.add_pipe(parser)
> ```

| Name | Type | Description |
| ----------- | -------- | ---------------------------------------------------------------------------------- |
| `name` | unicode | Factory name to look up in [`Language.factories`](/api/language#class-attributes). |
| `config` | dict | Configuration parameters to initialize component. |
| RETURNS | callable | The pipeline component. |

## Language.add_pipe {#add_pipe tag="method" new="2"}

Add a component to the processing pipeline. Valid components are callables that
take a `Doc` object, modify it and return it. Only one of `before`, `after`,
`first` or `last` can be set. Default behavior is `last=True`.

> #### Example
>
> ```python
> def component(doc):
>     # modify Doc and return it
>     return doc
>
> nlp.add_pipe(component, before="ner")
> nlp.add_pipe(component, name="custom_name", last=True)
> ```

| Name | Type | Description |
| ----------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `component` | callable | The pipeline component. |
| `name` | unicode | Name of pipeline component. Overwrites existing `component.name` attribute if available. If no `name` is set and the component exposes no name attribute, `component.__name__` is used. An error is raised if the name already exists in the pipeline. |
| `before` | unicode | Component name to insert component directly before. |
| `after` | unicode | Component name to insert component directly after. |
| `first` | bool | Insert component first / not first in the pipeline. |
| `last` | bool | Insert component last / not last in the pipeline. |
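
To make the positioning rules concrete, here is a minimal sketch that operates on a plain list of `(name, component)` tuples. It is not spaCy's implementation: `insert_component` is a hypothetical helper that only mirrors the documented behavior (at most one of `before`, `after`, `first` or `last`, appending by default, and an error for duplicate names).

```python
def insert_component(pipeline, name, component,
                     before=None, after=None, first=None, last=None):
    """Return a new pipeline list with (name, component) inserted."""
    names = [n for n, _ in pipeline]
    if name in names:
        raise ValueError("'{}' already exists in the pipeline".format(name))
    # At most one of the positional arguments may be set.
    if sum(arg is not None for arg in (before, after, first, last)) > 1:
        raise ValueError("Only one of before, after, first or last can be set")
    entry = (name, component)
    new_pipeline = list(pipeline)
    if before is not None:
        new_pipeline.insert(names.index(before), entry)
    elif after is not None:
        new_pipeline.insert(names.index(after) + 1, entry)
    elif first:
        new_pipeline.insert(0, entry)
    else:
        # Default behavior, equivalent to last=True.
        new_pipeline.append(entry)
    return new_pipeline


def identity(doc):
    # A stand-in component that returns its input unchanged.
    return doc


pipeline = [("tagger", identity), ("parser", identity), ("ner", identity)]
pipeline = insert_component(pipeline, "custom", identity, before="ner")
print([name for name, _ in pipeline])  # ['tagger', 'parser', 'custom', 'ner']
```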

## Language.has_pipe {#has_pipe tag="method" new="2"}

Check whether a component is present in the pipeline. Equivalent to
`name in nlp.pipe_names`.

> #### Example
>
> ```python
> nlp.add_pipe(lambda doc: doc, name="component")
> assert "component" in nlp.pipe_names
> assert nlp.has_pipe("component")
> ```

| Name | Type | Description |
| ----------- | ------- | -------------------------------------------------------- |
| `name` | unicode | Name of the pipeline component to check. |
| RETURNS | bool | Whether a component of that name exists in the pipeline. |

## Language.get_pipe {#get_pipe tag="method" new="2"}

Get a pipeline component for a given component name.

> #### Example
>
> ```python
> parser = nlp.get_pipe("parser")
> custom_component = nlp.get_pipe("custom_component")
> ```

| Name | Type | Description |
| ----------- | -------- | -------------------------------------- |
| `name` | unicode | Name of the pipeline component to get. |
| RETURNS | callable | The pipeline component. |

## Language.replace_pipe {#replace_pipe tag="method" new="2"}

Replace a component in the pipeline.

> #### Example
>
> ```python
> nlp.replace_pipe("parser", my_custom_parser)
> ```

| Name | Type | Description |
| ----------- | -------- | --------------------------------- |
| `name` | unicode | Name of the component to replace. |
| `component` | callable | The pipeline component to insert. |

## Language.rename_pipe {#rename_pipe tag="method" new="2"}

Rename a component in the pipeline. Useful to create custom names for
pre-defined and pre-loaded components. To change the default name of a component
added to the pipeline, you can also use the `name` argument on
[`add_pipe`](/api/language#add_pipe).

> #### Example
>
> ```python
> nlp.rename_pipe("parser", "spacy_parser")
> ```

| Name | Type | Description |
| ---------- | ------- | -------------------------------- |
| `old_name` | unicode | Name of the component to rename. |
| `new_name` | unicode | New name of the component. |

## Language.remove_pipe {#remove_pipe tag="method" new="2"}

Remove a component from the pipeline. Returns the removed component name and
component function.

> #### Example
>
> ```python
> name, component = nlp.remove_pipe("parser")
> assert name == "parser"
> ```

| Name | Type | Description |
| ----------- | ------- | ----------------------------------------------------- |
| `name` | unicode | Name of the component to remove. |
| RETURNS | tuple | A `(name, component)` tuple of the removed component. |
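
Since the pipeline is exposed as a list of `(name, component)` tuples, replacing, renaming and removing components can be pictured as simple list operations. The helpers below are hypothetical and work on a plain list rather than an `nlp` object, so treat them as a sketch of the documented behavior, not as spaCy's code.

```python
def replace_component(pipeline, name, component):
    """Swap the component stored under `name`, keeping its position."""
    names = [n for n, _ in pipeline]
    index = names.index(name)  # raises ValueError if the name is unknown
    pipeline[index] = (name, component)


def rename_component(pipeline, old_name, new_name):
    """Give an existing component a new name, keeping its position."""
    names = [n for n, _ in pipeline]
    if new_name in names:
        raise ValueError("'{}' already exists in the pipeline".format(new_name))
    index = names.index(old_name)
    pipeline[index] = (new_name, pipeline[index][1])


def remove_component(pipeline, name):
    """Remove a component and return it as a (name, component) tuple."""
    names = [n for n, _ in pipeline]
    return pipeline.pop(names.index(name))


def identity(doc):
    # A stand-in component that returns its input unchanged.
    return doc


pipeline = [("tagger", identity), ("parser", identity)]
rename_component(pipeline, "parser", "spacy_parser")
name, component = remove_component(pipeline, "spacy_parser")
print(name)                            # spacy_parser
print([n for n, _ in pipeline])        # ['tagger']
```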

## Language.disable_pipes {#disable_pipes tag="contextmanager, method" new="2"}

Disable one or more pipeline components. If used as a context manager, the
pipeline will be restored to the initial state at the end of the block.
Otherwise, a `DisabledPipes` object is returned, which has a `.restore()` method
you can use to undo your changes.

> #### Example
>
> ```python
> with nlp.disable_pipes("tagger", "parser"):
>     nlp.begin_training()
>
> disabled = nlp.disable_pipes("tagger", "parser")
> nlp.begin_training()
> disabled.restore()
> ```

| Name | Type | Description |
| ----------- | --------------- | ------------------------------------------------------------------------------------ |
| `*disabled` | unicode | Names of pipeline components to disable. |
| RETURNS | `DisabledPipes` | The disabled pipes that can be restored by calling the object's `.restore()` method. |
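
The two usage styles (context manager versus an explicit `.restore()` call) can be sketched without spaCy. `ToyNlp` and `ToyDisabledPipes` below are hypothetical stand-ins that assume disabling simply means temporarily dropping entries from the pipeline list.

```python
class ToyNlp(object):
    """Hypothetical stand-in holding a list of (name, component) tuples."""

    def __init__(self, pipeline):
        self.pipeline = list(pipeline)

    @property
    def pipe_names(self):
        return [name for name, _ in self.pipeline]


class ToyDisabledPipes(object):
    """Drop the named components and remember how to put them back."""

    def __init__(self, nlp, *names):
        self.nlp = nlp
        self.original = list(nlp.pipeline)
        nlp.pipeline = [(n, c) for n, c in nlp.pipeline if n not in names]

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Leaving the block restores the pipeline automatically.
        self.restore()

    def restore(self):
        self.nlp.pipeline = list(self.original)


def identity(doc):
    # A stand-in component that returns its input unchanged.
    return doc


nlp = ToyNlp([("tagger", identity), ("parser", identity), ("ner", identity)])

# Style 1: context manager, restored at the end of the block.
with ToyDisabledPipes(nlp, "tagger", "parser"):
    names_inside = nlp.pipe_names
names_after = nlp.pipe_names

# Style 2: keep the object around and restore explicitly.
disabled = ToyDisabledPipes(nlp, "ner")
names_disabled = nlp.pipe_names
disabled.restore()

print(names_inside)    # ['ner']
print(names_after)     # ['tagger', 'parser', 'ner']
print(names_disabled)  # ['tagger', 'parser']
```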

## Language.to_disk {#to_disk tag="method" new="2"}

Save the current state to a directory. If a model is loaded, this will **include
the model**.

> #### Example
>
> ```python
> nlp.to_disk("/path/to/models")
> ```

| Name | Type | Description |
| --------- | ---------------- | --------------------------------------------------------------------------------------------------------------------- |
| `path` | unicode / `Path` | A path to a directory, which will be created if it doesn't exist. Paths may be either strings or `Path`-like objects. |
| `disable` | list | Names of pipeline components to [disable](/usage/processing-pipelines#disabling) and prevent from being saved. |

## Language.from_disk {#from_disk tag="method" new="2"}

Load state from a directory. Modifies the object in place and returns it. If
the saved `Language` object contains a model, the model will be loaded. Note
that this method is commonly used via the subclasses like `English` or `German`
to make language-specific functionality like the
[lexical attribute getters](/usage/adding-languages#lex-attrs) available to the
loaded object.

> #### Example
>
> ```python
> from spacy.language import Language
> nlp = Language().from_disk("/path/to/model")
>
> # using language-specific subclass
> from spacy.lang.en import English
> nlp = English().from_disk("/path/to/en_model")
> ```

| Name | Type | Description |
| ----------- | ---------------- | --------------------------------------------------------------------------------- |
| `path` | unicode / `Path` | A path to a directory. Paths may be either strings or `Path`-like objects. |
| `disable` | list | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
| RETURNS | `Language` | The modified `Language` object. |

<Infobox title="Changed in v2.0" variant="warning">

As of spaCy v2.0, the `save_to_directory` method has been renamed to `to_disk`,
to improve consistency across classes. Pipeline components to prevent from being
loaded can now be added as a list to `disable`, instead of specifying one
keyword argument per component.

```diff
- nlp = spacy.load("en", tagger=False, entity=False)
+ nlp = English().from_disk("/model", disable=["tagger", "ner"])
```

</Infobox>

## Language.to_bytes {#to_bytes tag="method"}

Serialize the current state to a binary string.

> #### Example
>
> ```python
> nlp_bytes = nlp.to_bytes()
> ```

| Name | Type | Description |
| ----------- | ----- | ------------------------------------------------------------------------------------------------------------------- |
| `disable` | list | Names of pipeline components to [disable](/usage/processing-pipelines#disabling) and prevent from being serialized. |
| RETURNS | bytes | The serialized form of the `Language` object. |

## Language.from_bytes {#from_bytes tag="method"}

Load state from a binary string. Note that this method is commonly used via the
subclasses like `English` or `German` to make language-specific functionality
like the [lexical attribute getters](/usage/adding-languages#lex-attrs)
available to the loaded object.

> #### Example
>
> ```python
> from spacy.lang.en import English
> nlp_bytes = nlp.to_bytes()
> nlp2 = English()
> nlp2.from_bytes(nlp_bytes)
> ```

| Name | Type | Description |
| ------------ | ---------- | --------------------------------------------------------------------------------- |
| `bytes_data` | bytes | The data to load from. |
| `disable` | list | Names of pipeline components to [disable](/usage/processing-pipelines#disabling). |
| RETURNS | `Language` | The `Language` object. |

<Infobox title="Changed in v2.0" variant="warning">

Pipeline components to prevent from being loaded can now be added as a list to
`disable`, instead of specifying one keyword argument per component.

```diff
- nlp = English().from_bytes(bytes, tagger=False, entity=False)
+ nlp = English().from_bytes(bytes, disable=["tagger", "ner"])
```

</Infobox>

## Attributes {#attributes}

| Name | Type | Description |
| --------------------------------------- | ------------------ | ----------------------------------------------------------------------------------------------- |
| `vocab` | `Vocab` | A container for the lexical types. |
| `tokenizer` | `Tokenizer` | The tokenizer. |
| `make_doc` | `lambda text: Doc` | Create a `Doc` object from unicode text. |
| `pipeline` | list | List of `(name, component)` tuples describing the current processing pipeline, in order. |
| `pipe_names` <Tag variant="new">2</Tag> | list | List of pipeline component names, in order. |
| `meta` | dict | Custom meta data for the Language class. If a model is loaded, contains meta data of the model. |
| `path` <Tag variant="new">2</Tag> | `Path` | Path to the model data directory, if a model is loaded. Otherwise `None`. |

## Class attributes {#class-attributes}

| Name | Type | Description |
| -------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `Defaults` | class | Settings, data and factory methods for creating the `nlp` object and processing pipeline. |
| `lang` | unicode | Two-letter language ID, i.e. [ISO code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). |
| `factories` <Tag variant="new">2</Tag> | dict | Factories that create pre-defined pipeline components, e.g. the tagger, parser or entity recognizer, keyed by their component name. |