mirror of
https://github.com/explosion/spaCy.git
synced 2025-10-22 11:44:16 +03:00
<!--- Provide a general summary of your changes in the title. -->
## Description
This PR adds the ability to override custom extension attributes during merging. This only works for attributes that are writable, i.e. attributes registered with a default value like `default=False`, or attributes that have both a getter *and* a setter implemented.
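```python
import spacy
from spacy.tokens import Token

# A blank English pipeline is enough here, since only tokenization is needed
nlp = spacy.blank("en")

# Registering with a default value makes the attribute writable
Token.set_extension("is_musician", default=False)
doc = nlp("I like David Bowie.")
with doc.retokenize() as retokenizer:
    # Extension attributes go under the "_" key of the attrs dict
    attrs = {"LEMMA": "David Bowie", "_": {"is_musician": True}}
    retokenizer.merge(doc[2:4], attrs=attrs)
assert doc[2].text == "David Bowie"
assert doc[2].lemma_ == "David Bowie"
assert doc[2]._.is_musician
```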
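The getter-and-setter case can be sketched in the same way. This is a minimal illustration, not part of the PR's tests: the attribute names `stage_name` and `n_chars`, the helper functions, and the choice to store values in `doc.user_data` keyed by the token's character offset are all assumptions made for the example. It also assumes that a getter-only attribute is rejected with a `ValueError` when passed to `retokenizer.merge`.

```python
import spacy
from spacy.tokens import Token


def get_stage_name(token):
    # Hypothetical getter: read the value stored for this token's character offset
    return token.doc.user_data.get(("stage_name", token.idx))


def set_stage_name(token, value):
    # Hypothetical setter: having a setter is what makes the attribute writable
    token.doc.user_data[("stage_name", token.idx)] = value


# Writable: both a getter and a setter are registered
Token.set_extension("stage_name", getter=get_stage_name, setter=set_stage_name, force=True)
# Not writable: getter only
Token.set_extension("n_chars", getter=lambda token: len(token.text), force=True)

nlp = spacy.blank("en")

doc = nlp("I like David Bowie.")
with doc.retokenize() as retokenizer:
    retokenizer.merge(doc[2:4], attrs={"_": {"stage_name": "Ziggy Stardust"}})
merged_stage_name = doc[2]._.stage_name

# Assumption: overriding a getter-only attribute during merging raises ValueError
doc2 = nlp("I like David Bowie.")
try:
    with doc2.retokenize() as retokenizer:
        retokenizer.merge(doc2[2:4], attrs={"_": {"n_chars": 99}})
    getter_only_rejected = False
except ValueError:
    getter_only_rejected = True
```

Keying the stored value by `token.idx` keeps it attached to the merged token here, because the merged token starts at the same character offset as the first token of the span.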
### Types of change
enhancement
## Checklist
<!--- Before you submit the PR, go over this checklist and make sure you can
tick off all the boxes. [] -> [x] -->
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
| File |
|---|
| annotation.md |
| cli.md |
| cython-classes.md |
| cython-structs.md |
| cython.md |
| dependencyparser.md |
| doc.md |
| entityrecognizer.md |
| entityruler.md |
| goldcorpus.md |
| goldparse.md |
| index.md |
| language.md |
| lemmatizer.md |
| lexeme.md |
| matcher.md |
| phrasematcher.md |
| pipeline-functions.md |
| sentencesegmenter.md |
| span.md |
| stringstore.md |
| tagger.md |
| textcategorizer.md |
| token.md |
| tokenizer.md |
| top-level.md |
| vectors.md |
| vocab.md |