spaCy/spacy/tokens
Madeesh Kannan 41389ffe1e
Avoid pickling Doc inputs passed to Language.pipe() (#10864)
* `Language.pipe()`: Serialize `Doc` objects to bytes when using multiprocessing to avoid pickling overhead

* `Doc.to_dict()`: Serialize `_context` attribute (keeping in line with `(un)pickle_doc()`

* Correct type annotations

* Fix typo

* `Doc`: Do not serialize `_context`

* `Language.pipe`: Send context objects to child processes, Simplify `as_tuples` handling

* Fix type annotation

* `Language.pipe`: Simplify `as_tuple` multiprocessor handling

* Cleanup code, fix typos

* MyPy fixes

* Move doc preparation function into `_multiprocessing_pipe`
Whitespace changes

* Remove superfluous comma

* Rename `prepare_doc` to `prepare_input`

* Update spacy/errors.py

* Undo renaming for error

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-06-02 20:06:49 +02:00
..
__init__.pxd * Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx 2015-07-13 20:20:58 +02:00
__init__.py Fix SpanGroup import (#7182) 2021-02-24 21:06:16 +11:00
_dict_proxies.py Fix: De/Serialize SpanGroups including the SpanGroup keys (#10707) 2022-06-02 15:56:27 +02:00
_retokenize.pyi 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
_retokenize.pyx Fix tensor retokenization for non-numpy ops (#7527) 2021-03-29 22:34:48 +11:00
_serialize.py Maintain support for empty DocBin span groups (#10538) 2022-03-24 11:51:07 +01:00
doc.pxd Set as_tuples on Doc during processing (#9592) 2021-11-02 15:08:22 +01:00
doc.pyi Add Doc.from_json() (#10688) 2022-06-02 14:03:47 +02:00
doc.pyx Avoid pickling Doc inputs passed to Language.pipe() (#10864) 2022-06-02 20:06:49 +02:00
graph.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
graph.pyx Refactor error messages to remove hardcoded strings (#10729) 2022-05-02 13:38:46 +02:00
morphanalysis.pxd Modify morphology to support arbitrary features (#4932) 2020-01-23 22:01:54 +01:00
morphanalysis.pyi 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
morphanalysis.pyx Minor refactor for Morphology and MorphAnalysis (#5804) 2020-07-24 09:28:06 +02:00
span_group.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
span_group.pyi Fix: De/Serialize SpanGroups including the SpanGroup keys (#10707) 2022-06-02 15:56:27 +02:00
span_group.pyx Fix a few minor bugs in the SpanGroup API web docs (#10650) 2022-04-14 09:59:48 +02:00
span.pxd Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) 2021-01-14 17:30:41 +11:00
span.pyi Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
span.pyx Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
token.pxd cleanup 2021-01-13 14:20:05 +01:00
token.pyi 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
token.pyx Token sent attributes more consistent (#10164) 2022-02-08 08:35:37 +01:00
underscore.py Update typing hints (#10109) 2022-01-28 16:59:54 +01:00