spaCy

mirror of https://github.com/explosion/spaCy.git synced 2025-10-02 18:06:46 +03:00

Author	SHA1	Message	Date
Paul O'Leary McCann	f9c82e249c	Update error number This was changed by merge	2022-07-11 20:14:36 +09:00
Paul O'Leary McCann	4d032396b8	Merge branch 'feature/coref' into coref/dimension-inference	2022-07-11 19:18:46 +09:00
Paul O'Leary McCann	6d9eafeb37	Merge branch 'feature/coref' into fix/coref-alignment	2022-07-11 19:14:37 +09:00
Paul O'Leary McCann	f67c1735c5	Remove tok2vec_size from coref	2022-07-06 18:58:57 +09:00
Paul O'Leary McCann	bd17c38b74	It works! Was missing the serialization-related code from biaffine.	2022-07-06 18:58:22 +09:00
Paul O'Leary McCann	ce49136458	Update NotImplementedError for coref component	2022-07-06 17:28:15 +09:00
Paul O'Leary McCann	c4de3e51a2	Remove old TODOs	2022-07-06 17:23:41 +09:00
Paul O'Leary McCann	178feae00a	Add tests to give up with whitespace differences Docs in Examples are allowed to have arbitrarily different whitespace. Handling that properly would be nice but isn't required, but for now check for it and blow up.	2022-07-04 19:37:42 +09:00
Paul O'Leary McCann	79720886fa	Merge branch 'feature/coref' into fix/coref-alignment Had to renumber error message.	2022-07-01 19:09:29 +09:00
Paul O'Leary McCann	d1ff933e9b	Test works This may not be done yet, as the test is just for consistency, and not overfitting correctly yet.	2022-06-28 19:15:33 +09:00
Paul O'Leary McCann	16894e665d	Refactor Coval Scoring code (#10875) * Move coref scoring code to scorer.py Includes some renames to make names less generic. * Refactor coval code to remove ternary expressions * Black formatting * Add header * Make scorers into registered scorers * Small test fixes * Skip coref tests when torch not present Coref can't be loaded without Torch, so nothing works. * Fix remaining type issues Some of this just involves ignoring types in thorny areas. Two main issues: 1. Some things have weird types due to indirection/ argskwargs 2. xp2torch return type seems to have changed at some point * Update spacy/scorer.py Co-authored-by: kadarakos <kadar.akos@gmail.com> * Small changes from review * Be specific about the ValueError * Type fix Co-authored-by: kadarakos <kadar.akos@gmail.com>	2022-06-22 16:05:52 +09:00
Paul O'Leary McCann	196886bbca	Fix coref size inference (#10916) * Add explicit tok2vec_size parameter in clusterer * Add tok2vec size to span predictor config * Minor fixes	2022-06-08 20:03:41 +09:00
svlandeg	cea40c9d7b	fix types + black formatting	2022-05-25 13:34:09 +02:00
Paul O'Leary McCann	6087da9675	Suggestions from code review, cleanup, typing	2022-05-25 19:11:48 +09:00
Paul O'Leary McCann	2e8f0e9168	Rename coref params	2022-05-16 16:50:10 +09:00
kadarakos	7cf6bcca0e	merge misery	2022-05-10 17:19:16 +00:00
Paul O'Leary McCann	33f4f90ff0	Formatting	2022-05-10 19:09:52 +09:00
Paul O'Leary McCann	f852c5cea4	Split span predictor component into its own file This runs. The imports in both of the split files could probably use a close check to remove extras.	2022-05-10 18:53:45 +09:00
Paul O'Leary McCann	afd255c0ed	Undo multiply by 100 This was mistaken, not sure why my score seemed to be off before.	2022-04-14 18:42:09 +09:00
Paul O'Leary McCann	08729e0fbd	Remove end adjustment The difference in environments was due to a change in Thinc, the code here is fine.	2022-04-14 18:31:30 +09:00
Paul O'Leary McCann	8181d4570c	Multiply accuracy by 100 This seems to match with the scorer expectations better	2022-04-14 15:56:38 +09:00
Paul O'Leary McCann	e8af02700f	Remove all coref scoring except LEA This is necessary because one of the three old methods relied on scipy for some complex problem solving. LEA is generally better for evaluations. The downside is that this means evaluations aren't comparable with many papers, but canonical scoring can be supported using external eval scripts or other methods.	2022-04-13 21:02:18 +09:00
Paul O'Leary McCann	2300f4df3d	Fix span score logging	2022-04-13 20:37:06 +09:00
Paul O'Leary McCann	d470fa03c1	Adjust end indices It's not clear if this is technically correct or not but it won't run without it for me.	2022-04-13 20:19:21 +09:00
kadarakos	b53113e3b8	Preparing span predictor for predicting from gold (#10547) Note this is squashed because rebasing had conflicts. * remove unnecessary .device * span predictor debug start * gearing up SpanPredictor for gold-heads * merge SpanPredictor attributes * remove useless extra prefix and device from spanpredictor * make sure predicted and reference keeps aligned * handle empty head_ids * handle empty clusters * addressing suggestions by @polm * nicer restore * fix score overwriting bug * prepare for aligned heads-spans training * span accuracy score * update with eg.predicted as other components * add backprop callback to spanpredictor * report start- and end-accuracies separately * fixing scorer Co-authored-by: Kádár Ákos <akos@onyx.uvt.nl>	2022-04-13 19:42:49 +09:00
Kádár Ákos	6aedd98d02	fixing scorer	2022-04-11 16:10:14 +02:00
Kádár Ákos	7a239f2ec7	report start- and end-accuracies separately	2022-04-08 14:57:19 +02:00
Kádár Ákos	3ba913109d	update with eg.predicted as other components	2022-04-07 13:20:12 +02:00
Kádár Ákos	ef141ad399	span accuracy score	2022-04-04 18:10:09 +02:00
Kádár Ákos	a1d0219903	prepare for aligned heads-spans training	2022-04-04 15:26:15 +02:00
Kádár Ákos	63a41ba50a	fix score overwriting bug	2022-03-30 17:28:20 +02:00
Kádár Ákos	7ff99a3acc	nicer restore	2022-03-28 18:16:41 +02:00
Kádár Ákos	06d680b269	addressing suggestions by @polm	2022-03-28 14:31:51 +02:00
Kádár Ákos	e4b4b67ef6	handle empty clusters	2022-03-28 11:29:00 +02:00
Kádár Ákos	7304604edd	make sure predicted and reference keeps aligned	2022-03-25 18:29:33 +01:00
Kádár Ákos	83ac0477c8	remove useless extra prefix and device from spanpredictor	2022-03-24 16:44:50 +01:00
Kádár Ákos	706b2e6f25	gearing up SpanPredictor for gold-heads	2022-03-24 16:06:20 +01:00
Kádár Ákos	1eaf8fb0cf	span predictor debug start	2022-03-23 11:24:27 +01:00
Paul O'Leary McCann	2190cbc0e6	Add progress on SpanPredictor component This isn't working. There is a CUDA error in the torch code during initialization and it's not clear why.	2022-03-19 19:39:49 +09:00
Paul O'Leary McCann	a098849112	Add fake batching The way fake batching works is that the pipeline component calls the model repeatedly in a loop internally. It feels like this should break something, but it worked in testing. Another issue is that this changes the signature of some of the pipeline functions, though I don't think that's an issue. Tested with batch size of 2, so more testing is needed, but this is a start.	2022-03-18 19:46:58 +09:00
Paul O'Leary McCann	1a79d18796	Formatting	2022-03-16 20:10:47 +09:00
Paul O'Leary McCann	6855df0e66	Skeleton for span predictor component This should be moved into its own file, but for now just stubbing out the methods.	2022-03-16 20:09:33 +09:00
Paul O'Leary McCann	7811a1194b	Change architecture	2022-03-16 14:57:15 +09:00
Paul O'Leary McCann	55039a66ad	Remove old default config	2022-03-15 19:53:09 +09:00
Paul O'Leary McCann	17d017a177	Remove span2head This doesn't work as a component because it needs to modify gold data, so instead it's a conversion script (in another repo).	2022-03-15 19:52:20 +09:00
Paul O'Leary McCann	0522a43116	Make span2head component	2022-03-15 19:19:15 +09:00
Paul O'Leary McCann	dfec6993d6	Training works now	2022-03-14 19:27:23 +09:00
Paul O'Leary McCann	8eadf3781b	Training runs now Evaluation needs fixing, and code still needs cleanup.	2022-03-14 19:02:17 +09:00
Paul O'Leary McCann	d22a002641	Forward/backward pass works Evaluate does not work - predict hasn't been updated	2022-03-14 17:26:27 +09:00
Paul O'Leary McCann	230698dc83	Fix bug in scorer Scoring code was just using one metric, not all three of interest.	2021-08-12 18:22:08 +09:00

72 Commits