spaCy

mirror of https://github.com/explosion/spaCy.git synced 2026-02-17 20:50:55 +03:00

Matthew Honnibal 6f5e308d17 Support negative examples in partial NER annotations (#8106)	2021-06-17 17:33:00 +10:00

* Support a cfg field in transition system
* Make NER 'has gold' check use right alignment for span
* Pass 'negative_samples_key' property into NER transition system
* Add field for negative samples to NER transition system
* Check neg_key in NER has_gold
* Support negative examples in NER oracle
* Test for negative examples in NER
* Fix name of config variable in NER
* Remove vestiges of old-style partial annotation
* Remove obsolete tests
* Add comment noting lack of support for negative samples in parser
* Additions to "neg examples" PR (#8201)
* add custom error and test for deprecated format
* add test for unlearning an entity
* add break also for Begin's cost
* add negative_samples_key property on Parser
* rename
* extend docs & fix some older docs issues
* add subclass constructors, clean up tests, fix docs
* add flaky test with ValueError if gold parse was not found
* remove ValueError if n_gold == 0
* fix docstring
* Hack in environment variables to try out training
* Remove hack
* Remove NER hack, and support 'negative O' samples
* Fix O oracle
* Fix transition parser
* Remove 'not O' from oracle
* Fix NER oracle
* check for spans in both gold.ents and gold.spans and raise if so, to prevent memory access violation
* use set instead of list in consistency check

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
converters	Fix parser sourcing in NER converter (#7631)	2021-04-08 12:25:03 +02:00
__init__.pxd	Renaming gold & annotation_setter (#6042)	2020-09-09 10:31:03 +02:00
__init__.py	Add callback to copy vocab/tokenizer from model (#7750)	2021-04-22 12:36:50 +02:00
align.pyx	Fix alignment for 1-to-1 tokens and lowercasing (#6476)	2020-12-08 14:25:16 +08:00
alignment.py	Replace pytokenizations with internal alignment (#6293)	2020-11-03 16:24:38 +01:00
augment.py	Fix lowercase augmentation (#7336)	2021-03-09 14:02:32 +11:00
batchers.py	ensure tolerance is properly passed on (#8158)	2021-05-27 18:10:28 +10:00
callbacks.py	Add callback to copy vocab/tokenizer from model (#7750)	2021-04-22 12:36:50 +02:00
corpus.py	Make JsonlReader path optional (#8396)	2021-06-15 14:55:15 +02:00
example.pxd	Make a pre-check to speed up alignment cache (#6139)	2020-09-24 18:13:39 +02:00
example.pyx	Support negative examples in partial NER annotations (#8106)	2021-06-17 17:33:00 +10:00
gold_io.pyx	Fix is_sent_start when converting from JSON (fix #7635) (#7655)	2021-04-08 18:24:52 +10:00
initialize.py	Support large/infinite training corpora (#7208)	2021-04-08 18:08:04 +10:00
iob_utils.py	fix docs (#8200)	2021-05-27 10:48:59 +02:00
loggers.py	W&B integration: Optional support for dataset and model checkpoint logging and versioning (#7429)	2021-04-01 19:36:23 +02:00
loop.py	Add training option to set annotations on update (#7767)	2021-04-26 16:53:53 +02:00
pretrain.py	replace "is not" with !=	2021-03-18 21:09:11 +01:00