spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-11-25 12:25:51 +03:00


Ines Montani df19e2bff6 💫 Allow setting of custom attributes during retokenization (closes #3314) (#3324)	2019-02-24 18:38:47 +01:00

## Description

This PR adds the ability to override custom extension attributes during merging. This only works for attributes that are writable, i.e. attributes registered with a default value like `default=False`, or attributes that have both a getter and a setter implemented.

```python
import spacy
from spacy.tokens import Token

# Assumption: a blank English pipeline stands in for the `nlp` object,
# which the original snippet leaves undefined.
nlp = spacy.blank("en")

# Register a writable extension attribute (it has a default value).
Token.set_extension('is_musician', default=False)

doc = nlp("I like David Bowie.")
with doc.retokenize() as retokenizer:
    # Custom extension attributes go under the "_" key.
    attrs = {"LEMMA": "David Bowie", "_": {"is_musician": True}}
    retokenizer.merge(doc[2:4], attrs=attrs)
assert doc[2].text == "David Bowie"
assert doc[2].lemma_ == "David Bowie"
assert doc[2]._.is_musician
```

### Types of change

enhancement

## Checklist

- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
__init__.py	Add __init__.py file for regression tests	2016-11-01 13:45:06 +01:00
_test_issue1622.py	Tidy up regression tests	2019-02-08 15:51:13 +01:00
_test_issue2800.py	Tidy up and format remaining files	2018-11-30 17:43:08 +01:00
test_issue1-1000.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue1001-1500.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue1501-2000.py	💫 Replace {Doc,Span}.merge with Doc.retokenize (#3280)	2019-02-15 10:29:44 +01:00
test_issue1971.py	💫 Allow setting of custom attributes during retokenization (closes #3314) (#3324)	2019-02-24 18:38:47 +01:00
test_issue2001-2500.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue2501-3000.py	Tidy up regression tests	2019-02-08 15:51:13 +01:00
test_issue2656.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue2728.py	Fix escaping of HTML in displacy ENT (closes #2728)	2019-02-21 14:30:39 +01:00
test_issue2822.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue2833.py	Also raise error in Span.__reduce__	2019-02-13 13:22:05 +01:00
test_issue2926.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue3002.py	Clean up of char classes, few tokenizer fixes and faster default French tokenizer (#3293)	2019-02-20 22:10:13 +01:00
test_issue3009.py	Tidy up and fix small bugs and typos	2019-02-08 14:14:49 +01:00
test_issue3012.py	Tidy up and fix small bugs and typos	2019-02-08 14:14:49 +01:00
test_issue3199.py	Only run noun chunks iterator in Span if available (closes #3199)	2019-02-08 18:33:16 +01:00
test_issue3209.py	Tidy up and auto-format	2019-02-13 15:29:08 +01:00
test_issue3248.py	Tidy up and auto-format	2019-02-13 15:29:08 +01:00
test_issue3277.py	💫 Add en/em dash to prefixes and suffixes (#3281)	2019-02-15 10:29:59 +01:00
test_issue3288.py	Tidy up tests	2019-02-24 14:11:23 +01:00
test_issue3289.py	Tidy up tests	2019-02-24 14:11:23 +01:00