mirror of
https://github.com/explosion/spaCy.git
synced 2024-12-25 09:26:27 +03:00
Clarify serialization of extension attributes (closes #4377) [ci skip]
parent fec9433044
commit e65dffd80b
@ -46,7 +46,7 @@ Create a `DocBin` object to hold serialized annotations.
| Argument | Type | Description |
| ----------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `attrs` | list | List of attributes to serialize. `orth` (hash of token text) and `spacy` (whether the token is followed by whitespace) are always serialized, so they're not required. Defaults to `None`. |
-| `store_user_data` | bool | Whether to include the `Doc.user_data`. Defaults to `False`. |
+| `store_user_data` | bool | Whether to include the `Doc.user_data` and the values of custom extension attributes. Defaults to `False`. |
| **RETURNS** | `DocBin` | The newly constructed object. |

## DocBin.\_\_len\_\_ {#len tag="method"}

@ -92,6 +92,25 @@ doc_bin = DocBin().from_bytes(bytes_data)
docs = list(doc_bin.get_docs(nlp.vocab))
```

If `store_user_data` is set to `True`, the `Doc.user_data` will be serialized as
well, which includes the values of
[extension attributes](/processing-pipelines#custom-components-attributes) (if
they're serializable with msgpack).

<Infobox title="Important note on serializing extension attributes" variant="warning">

Including the `Doc.user_data` and extension attributes will only serialize the
**values** of the attributes. To restore the values and access them via the
`doc._.` property, you need to register the global attribute on the `Doc` again.

```python
docs = list(doc_bin.get_docs(nlp.vocab))
Doc.set_extension("my_custom_attr", default=None)
print([doc._.my_custom_attr for doc in docs])
```

</Infobox>

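The round trip described above can be sketched end to end as follows. This is a minimal illustration, not part of the documented API surface: the helper name `roundtrip_custom_attr` is hypothetical, `my_custom_attr` is the placeholder name from the snippet above, and a blank English pipeline stands in for a real `nlp` object. Removing the extension midway is only a stand-in for loading the data in a fresh process, and passing `store_user_data=True` on the loading side as well is an assumption made to be safe across spaCy versions.

```python
import spacy
from spacy.tokens import Doc, DocBin


def roundtrip_custom_attr(text, value):
    """Serialize a Doc with an extension value and read it back (sketch)."""
    nlp = spacy.blank("en")  # blank pipeline, no trained model required

    # Register the (hypothetical) extension so a value can be assigned.
    Doc.set_extension("my_custom_attr", default=None, force=True)
    doc = nlp(text)
    doc._.my_custom_attr = value

    # store_user_data=True includes Doc.user_data, where extension values live.
    doc_bin = DocBin(store_user_data=True)
    doc_bin.add(doc)
    bytes_data = doc_bin.to_bytes()

    # Stand-in for a new process: the global registration is gone,
    # only the serialized values remain in bytes_data.
    Doc.remove_extension("my_custom_attr")

    # Assumption: also request user data when loading.
    loaded = DocBin(store_user_data=True).from_bytes(bytes_data)
    docs = list(loaded.get_docs(nlp.vocab))

    # Register the attribute again to expose the restored values via doc._.
    Doc.set_extension("my_custom_attr", default=None)
    return [d._.my_custom_attr for d in docs]


if __name__ == "__main__":
    print(roundtrip_custom_attr("Hello world", "some value"))
```

The values only need to be msgpack-serializable; the getter/setter/default configuration of the extension itself is not stored, which is why the registration step has to be repeated.
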
### Using Pickle {#pickle}

> #### Example