spaCy/spacy/tests/serialize
Matthew Honnibal 6f5e308d17
Support negative examples in partial NER annotations (#8106)
* Support a cfg field in transition system

* Make NER 'has gold' check use right alignment for span

* Pass 'negative_samples_key' property into NER transition system

* Add field for negative samples to NER transition system

* Check neg_key in NER has_gold

* Support negative examples in NER oracle

* Test for negative examples in NER

* Fix name of config variable in NER

* Remove vestiges of old-style partial annotation

* Remove obsolete tests

* Add comment noting lack of support for negative samples in parser

* Additions to "neg examples" PR (#8201)

* add custom error and test for deprecated format

* add test for unlearning an entity

* add break also for Begin's cost

* add negative_samples_key property on Parser

* rename

* extend docs & fix some older docs issues

* add subclass constructors, clean up tests, fix docs

* add flaky test with ValueError if gold parse was not found

* remove ValueError if n_gold == 0

* fix docstring

* Hack in environment variables to try out training

* Remove hack

* Remove NER hack, and support 'negative O' samples

* Fix O oracle

* Fix transition parser

* Remove 'not O' from oracle

* Fix NER oracle

* check for spans in both gold.ents and gold.spans and raise if so, to prevent memory access violation

* use set instead of list in consistency check

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-06-17 17:33:00 +10:00
__init__.py                        Revert #4334                                                  2019-09-29 17:32:12 +02:00
test_resource_warning.py           Tidy up tests                                                 2020-10-15 10:20:21 +02:00
test_serialize_config.py           Ensure hyphen in config file works as string value (#7642)    2021-04-12 14:35:57 +02:00
test_serialize_doc.py              Add ENT_ID and NORM to DocBin strings (#8054)                 2021-05-17 18:06:11 +10:00
test_serialize_extension_attrs.py  Merge branch 'master' into develop                            2020-02-18 14:47:23 +01:00
test_serialize_kb.py               consistently use registry as callable                         2021-03-02 17:56:28 +01:00
test_serialize_language.py         Remove dead and/or deprecated code (#5710)                    2020-07-06 13:06:25 +02:00
test_serialize_pipeline.py         Support negative examples in partial NER annotations (#8106)  2021-06-17 17:33:00 +10:00
test_serialize_tokenizer.py        Fix tokenizer cache flushing (#7836)                          2021-04-22 18:14:57 +10:00
test_serialize_vocab_strings.py    Make vocab update in get_docs deterministic (#7603)           2021-04-09 11:53:13 +02:00