spaCy/spacy/tokens
Matthew Honnibal 63b7accd74
💫 Make span.as_doc() return a copy, not a view. Closes #1537 (#3107)
Initially, span.as_doc() was designed to return a view of the span's contents as a Doc object. This was a nice idea, but it fails because of the token.idx property, which stores the token's character offset within the document text. In a span, the idx of the first token might not be 0, while a Doc built from that span would be expected to start at 0. Because this data differs, we can't have a view: its offsets would be inconsistent.

This patch changes span.as_doc() to instead return a copy. The docs are updated accordingly. Closes #1537
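
The offset problem described above can be sketched with a toy model in plain Python. This is not spaCy's implementation: ToyToken, tokenize, as_view and as_copy are hypothetical names used only for illustration, and whitespace splitting stands in for real tokenization. The point is that a view shares the original tokens, so their idx values still point into the full text, while a copy can rebase them to start at 0.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ToyToken:
    """Hypothetical stand-in for a token: its text plus a character offset."""
    text: str
    idx: int  # offset of the token within the text it belongs to


def tokenize(text: str) -> List[ToyToken]:
    """Split on whitespace and record each token's character offset."""
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        tokens.append(ToyToken(word, start))
        pos = start + len(word)
    return tokens


def as_view(tokens: List[ToyToken], start: int, end: int) -> List[ToyToken]:
    """Reuse the original token objects: idx still refers to the full text."""
    return tokens[start:end]


def as_copy(tokens: List[ToyToken], start: int, end: int) -> List[ToyToken]:
    """Create new tokens whose idx is rebased so the first one starts at 0."""
    base = tokens[start].idx
    return [replace(tok, idx=tok.idx - base) for tok in tokens[start:end]]


text = "Give it back! He pleaded."
doc = tokenize(text)

view = as_view(doc, 3, 5)    # tokens "He", "pleaded."
copied = as_copy(doc, 3, 5)  # same words, offsets rebased

# Text of the span on its own, as a standalone document would see it.
span_text = text[doc[3].idx:]

print(view[0].idx)    # offset into the full text, not into span_text
print(copied[0].idx)  # offset into span_text
```

In this sketch the view's first token reports an offset that is only valid against the original text, so slicing span_text with it would land in the wrong place. The copy's offsets index span_text correctly, which is the kind of consistency the commit message says a view cannot provide.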

* Update test for span.as_doc()

* Make span.as_doc() return a copy. Closes #1537

* Document change to Span.as_doc()
2018-12-30 15:17:46 +01:00
__init__.pxd * Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx 2015-07-13 20:20:58 +02:00
__init__.py Tidy up and document Doc, Token and Span 2017-10-27 15:41:45 +02:00
_retokenize.pyx Resize doc.tensor when merging spans. Closes #1963 (#3106) 2018-12-30 15:17:17 +01:00
_serialize.py 💫 Replace ujson, msgpack and dill/pickle/cloudpickle with srsly (#3003) 2018-12-03 01:28:22 +01:00
doc.pxd Fix issue 2396 (#3089) 2018-12-29 18:05:52 +01:00
doc.pyx Fix issue 2396 (#3089) 2018-12-29 18:05:52 +01:00
span.pxd Add Span.to_array method 2017-08-19 12:20:45 +02:00
span.pyx 💫 Make span.as_doc() return a copy, not a view. Closes #1537 (#3107) 2018-12-30 15:17:46 +01:00
token.pxd Make NORM a token attribute (#3029) 2018-12-08 10:49:10 +01:00
token.pyx Merge branch 'master' into develop 2018-12-18 13:48:10 +01:00
underscore.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00