Commit Graph

60 Commits

Author SHA1 Message Date
Kádár Ákos
83ac0477c8 remove useless extra prefix and device from spanpredictor 2022-03-24 16:44:50 +01:00
Kádár Ákos
1c5dabcb47 merge SpanPredictor attributes 2022-03-24 16:23:12 +01:00
Kádár Ákos
a872c69ffb merge 2022-03-24 16:10:04 +01:00
Kádár Ákos
706b2e6f25 gearing up SpanPredictor for gold-heads 2022-03-24 16:06:20 +01:00
Kádár Ákos
150e7c46d7 conflict 2022-03-23 11:27:02 +01:00
Kádár Ákos
1eaf8fb0cf span predictor debug start 2022-03-23 11:24:27 +01:00
Paul O'Leary McCann
eec00ce60d Fix various sizes in SpanPredictor FFNN 2022-03-23 16:20:31 +09:00
Paul O'Leary McCann
2190cbc0e6 Add progress on SpanPredictor component
This isn't working. There is a CUDA error in the torch code during
initialization and it's not clear why.
2022-03-19 19:39:49 +09:00
Kádár Ákos
db422abf01 remove unnecessary .device 2022-03-18 16:24:26 +01:00
Paul O'Leary McCann
0275ae29de Remove stale comment 2022-03-16 20:09:12 +09:00
Paul O'Leary McCann
6974f55daa Hack for transformer listener size 2022-03-16 15:15:53 +09:00
Paul O'Leary McCann
d0ae2590db Delete all the coref-hoi code 2022-03-15 20:05:24 +09:00
Paul O'Leary McCann
abdc7d87af Clean up util code
Moved everything into coref_util.py, deleted wl-specific file.
2022-03-15 19:59:44 +09:00
Paul O'Leary McCann
8eadf3781b Training runs now
Evaluation needs fixing, and code still needs cleanup.
2022-03-14 19:02:17 +09:00
Paul O'Leary McCann
d22a002641 Forward/backward pass works
Evaluate does not work - predict hasn't been updated
2022-03-14 17:26:27 +09:00
Paul O'Leary McCann
c4f9c24738 The coref model can be loaded
The span predictor component is initialized but not used at all yet. The plan is to work on it after the word-level clustering part is trainable end-to-end.
2022-03-09 19:31:11 +09:00
Paul O'Leary McCann
35cc2b138f Add span predictor code
Accidentally omitted before
2022-03-08 18:13:26 +09:00
Paul O'Leary McCann
1c697b4011 Remove references to config
Replaced with model arguments
2022-03-08 18:13:09 +09:00
Paul O'Leary McCann
c0cd5025e3 Start bringing in wl-coref
This absolutely does not work. The first step here is getting most of the code over, in roughly the files we want it in. After the code has been pulled over it can be restructured to match spaCy and cleaned up.
2022-03-06 20:00:15 +09:00
Paul O'Leary McCann
00d481dd12 Stack the mention scorer
In the reference implementations, there's usually a function to build an FFNN of arbitrary depth, consisting of a stack of Linear >> Relu >>
Dropout. In practice the depth is always 1 in coref-hoi, but in earlier
iterations of the model, which are more similar to our model here (since
we aren't using attention or even necessarily BERT), using a small depth
like 2 was common. This hard-codes a stack of 2.

In brief tests this allows similar performance to the unstacked version
with much smaller embedding sizes.

The depth of the stack could be made into a hyperparameter.
2021-08-09 18:04:42 +09:00
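For reference, a minimal sketch of the depth-2 stack described in the entry above, written with Thinc combinators. The function name, sizes, and dropout rate are placeholders, not the model's actual configuration; note that Thinc's `Relu` is itself a dense layer with a ReLU activation, so each `Relu >> Dropout` block plays the role of one `Linear >> Relu >> Dropout` step from the commit message.

```python
from thinc.api import chain, clone, Linear, Relu, Dropout

def build_mention_scorer(hidden_size: int = 128, dropout: float = 0.3):
    # Two copies of (dense + ReLU + dropout), then a single score per span.
    stack = clone(chain(Relu(nO=hidden_size), Dropout(dropout)), 2)
    return chain(stack, Linear(nO=1))
```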
Paul O'Leary McCann
56803d3909 Change mention limit to match reference implementations
This generally means fewer spans are considered, which makes individual training steps faster but can make it take longer for training to find the good spans.
2021-08-08 19:55:52 +09:00
Paul O'Leary McCann
8bd0474730 Run black 2021-07-18 20:20:22 +09:00
Paul O'Leary McCann
9b63cbb775 Add extract spans import 2021-07-15 18:16:53 +09:00
Paul O'Leary McCann
4a9dc00d86 Use relative indices for mentions
Was using batch absolute indices to manage mentions, but extract_spans
expects doc-relative ones.
2021-07-14 18:36:18 +09:00
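To make the index convention above concrete, a toy numpy example with hypothetical variable names (not the actual spaCy code): converting batch-absolute token indices to the doc-relative ones `extract_spans` expects just means subtracting the document's start offset within the concatenated batch.

```python
import numpy as np

# Three docs of lengths 5, 7 and 4 concatenated into one batch.
doc_lengths = np.array([5, 7, 4])
doc_offsets = np.concatenate([[0], np.cumsum(doc_lengths)[:-1]])  # [0, 5, 12]

# A mention covering batch-absolute tokens (6, 8) lives in the second doc,
# so relative to that doc it starts at token 1 and ends at token 3.
doc_id, abs_start, abs_end = 1, 6, 8
rel_start = abs_start - doc_offsets[doc_id]  # 1
rel_end = abs_end - doc_offsets[doc_id]      # 3
```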
Paul O'Leary McCann
c25ec292a9 Cleanup 2021-07-10 22:42:55 +09:00
Paul O'Leary McCann
e00bd422d9 Fix span embeds
Some of the lengths and backprop weren't right.

Also various cleanup.
2021-07-10 21:38:53 +09:00
Paul O'Leary McCann
d7d317a1b5 Clean up span embedding code
This is now cleaner and significantly faster. There are still some messy parts in the code (particularly variable names); will get to that later.
2021-07-10 19:59:08 +09:00
Paul O'Leary McCann
f34915c1e8 Use scatter_add to speed up span embed backprop
This was the slowest part of the code, and using scatter_add here
probably reduces the runtime by 50%.
2021-07-10 18:08:51 +09:00
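As a rough illustration of the technique mentioned above (toy numpy code, not the actual backprop): `scatter_add`-style ops accumulate per-token gradients from all span positions in a single vectorized call instead of a Python loop over spans; numpy's equivalent is `np.add.at`.

```python
import numpy as np

# Six token vectors of width 3; two spans covering tokens [0, 1, 2] and [2, 3].
d_tokens = np.zeros((6, 3), dtype="float32")
span_token_idx = np.array([0, 1, 2, 2, 3])      # token index for each span position
d_span_vecs = np.ones((5, 3), dtype="float32")  # incoming gradient per span position

# One call scatters all span gradients back onto the tokens; token 2 appears
# in both spans, so its contributions are summed.
np.add.at(d_tokens, span_token_idx, d_span_vecs)
print(d_tokens[2])  # [2. 2. 2.]
```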
Paul O'Leary McCann
d0b041aff4 Switch to using Thinc tuplify
The tuplify code here was added to Thinc proper and that's been
released, so no need to have it here any more.
2021-07-08 16:08:36 +09:00
Paul O'Leary McCann
eb5820b593 Improve take_vecs implementation
This pulls out references to needed bits so that other parts (the larger
embeddings) can be freed before backprop.
2021-07-05 21:08:42 +09:00
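A rough sketch of the idea behind this change (hypothetical names, not the actual implementation): the backprop callback closes over only the row indices and the original shape, not the full embedding array, so the large array can be garbage-collected between the forward and backward passes.

```python
import numpy as np

def take_vecs_forward(vecs: np.ndarray, idx: np.ndarray):
    taken = vecs[idx]  # fancy indexing copies out just the rows we need
    shape, dtype = vecs.shape, vecs.dtype

    def backprop(d_taken: np.ndarray) -> np.ndarray:
        # Only `shape`, `dtype` and `idx` are captured here, so the caller can
        # drop the big `vecs` array before calling backprop.
        d_vecs = np.zeros(shape, dtype=dtype)
        np.add.at(d_vecs, idx, d_taken)
        return d_vecs

    return taken, backprop
```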
Paul O'Leary McCann
13bef2ddb6 Add width prior feature
Not necessary for convergence, but in coref-hoi this seems to add a few F1 points.

Note that there are two width-related features in coref-hoi. This is a
"prior" that is added to mention scores. The other width related feature
is appended to the span embedding representation for other layers to
reference.
2021-07-05 21:06:28 +09:00
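As a hedged sketch of what such a prior usually looks like (Thinc-style, with placeholder names and sizes rather than the actual model code): each candidate span's width is bucketed, mapped to a learned embedding, and reduced to a single scalar that is added to that span's mention score.

```python
from thinc.api import chain, Embed, Linear

def build_width_prior(n_width_buckets: int = 10, hidden: int = 64):
    # Width bucket id -> learned embedding -> one scalar per span.
    # The resulting score is added to the mention score, as described above.
    return chain(Embed(nO=hidden, nV=n_width_buckets), Linear(nO=1))
```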
Paul O'Leary McCann
8f66176b2d Fix loss?
This rewrites the loss to not use the Thinc crossentropy code at all.
The main difference here is that the negative predictions are being
masked out (= marginalized over), but negative gradient is still being
reflected.

I'm still not sure this is exactly right but models seem to train
reliably now.
2021-07-05 18:17:10 +09:00
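The behavior described above (negatives masked out of the target but still pushed down) matches the usual marginal log-likelihood coref objective: gold antecedents are summed over, while non-gold candidates only enter through the softmax normalizer and therefore still receive negative gradient. A small numpy sketch of that objective, purely illustrative and not the spaCy implementation:

```python
import numpy as np

def marginal_nll(scores: np.ndarray, gold_mask: np.ndarray) -> float:
    # scores:    (n_mentions, n_candidates) antecedent scores
    # gold_mask: same shape, 1.0 where a candidate is a gold antecedent
    shifted = scores - scores.max(axis=1, keepdims=True)  # stable softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Sum probability mass over the gold antecedents of each mention, then take
    # the negative log. Negatives appear only in the normalizer, so they are
    # "masked out" of the target but still receive negative gradient.
    gold_prob = (np.exp(log_probs) * gold_mask).sum(axis=1)
    return float(-np.log(gold_prob + 1e-8).sum())
```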
Paul O'Leary McCann
5db28ec2fd Tweak mention limit calculation
The calculation of this in the coref-hoi code is hard to follow. Based
on comments and variable names it sounds like it's using the doc length,
but it might actually be the number of mentions? Number of mentions
should be much larger and seems more correct, but might want to revisit
this.
2021-07-03 21:13:32 +09:00
Paul O'Leary McCann
865caedebd Remove XXX comment
Comment wondered if there should be some subtraction to avoid double
counting, but it probably doesn't matter because the diagonal is 0.
2021-07-03 18:40:38 +09:00
Paul O'Leary McCann
f2e0e9dc28 Move placeholder handling into model code 2021-07-03 18:38:48 +09:00
Paul O'Leary McCann
3f66e18592 Clean up pw_prod loss
This doesn't change the math but makes the transposes slightly easier to
understand (maybe?).
2021-07-03 18:33:17 +09:00
Paul O'Leary McCann
5c98c4c3b9 Probably fix pw prod backprop
I think this change is correct, but intuition doesn't really help
here...
2021-06-17 21:23:00 +09:00
Paul O'Leary McCann
ccf561112a Remove old comments 2021-06-17 21:22:17 +09:00
Paul O'Leary McCann
a62121e3b4 Expose more hyperparameters 2021-06-17 21:21:46 +09:00
Paul O'Leary McCann
cb2364cf83 Fix type of mask
The call here was creating a float64 array, which was turning many
downstream scores into float64s. Later on these values were assigned to
a float32 array in backprop, and numerical underflow caused things to go
to zero.

That's almost certainly not the only reason things go to zero, but the float64 mask was incorrect regardless.
2021-06-17 17:56:00 +09:00
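A tiny reproduction of the dtype promotion described above (illustrative values, not the original call):

```python
import numpy as np

scores = np.random.rand(4, 4).astype("float32")

bad_mask = np.ones((4, 4))                    # defaults to float64
print((scores * bad_mask).dtype)              # float64 - promotes downstream values

good_mask = np.ones((4, 4), dtype="float32")  # request float32 explicitly
print((scores * good_mask).dtype)             # float32
```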
Paul O'Leary McCann
8452d117ef Fix typo, remove old comment 2021-06-13 19:42:55 +09:00
Paul O'Leary McCann
d71198ed36 Replace squeeze with flatten
At a few points in the code it's normal to get a "2d" array where each
row is a single entry. Calling squeeze will make that a proper 1d
array... unless it's just one entry, in which case it turns into a 0d
scalar. That's not what we want; flatten() provides the desired
behavior.
2021-06-12 19:48:01 +09:00
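A quick demonstration of the edge case described above, in plain numpy:

```python
import numpy as np

many = np.array([[1.0], [2.0], [3.0]])
one = np.array([[1.0]])

print(many.squeeze().shape)  # (3,)  - fine
print(one.squeeze().shape)   # ()    - a 0d scalar, not what downstream code expects
print(one.flatten().shape)   # (1,)  - flatten always yields a proper 1d array
```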
Paul O'Leary McCann
e728b0e45d Silence warning 2021-06-12 19:31:35 +09:00
Paul O'Leary McCann
18444fccd9 Remove old comment 2021-06-04 17:56:08 +09:00
svlandeg
0aa1083ce8 avoid repetitive entities in the output 2021-05-28 16:52:51 +02:00
svlandeg
0f5c586e2f add basic tests for debugging 2021-05-28 14:19:55 +02:00
svlandeg
391b512afd fix types of fwd functions 2021-05-27 16:36:46 +02:00
svlandeg
04b55bf054 removing unused imports 2021-05-27 16:31:38 +02:00
svlandeg
910026582d set versions to v1 instead of v0 2021-05-27 16:17:20 +02:00
Paul O'Leary McCann
d6389b133d Don't use a generator for no reason 2021-05-24 19:06:15 +09:00