Torch is required for the coref/spanpred models but shouldn't be
required for spaCy in general.
The one tricky part of this is that one function in coref_util relied on
torch, but that file was imported in several places. Since the function
was only used in one place I moved it there.
The span predictor component is initialized but not used at all now.
The plan is to work on it after the word-level clustering part is
trainable end-to-end.
This absolutely does not work yet. The first step here is getting most
of the code over into roughly the files we want it in. After the code
has been pulled over, it can be restructured to match spaCy and cleaned
up.
In the reference implementations, there's usually a function to build an
ffnn of arbitrary depth, consisting of a stack of Linear >> Relu >>
Dropout. In practice the depth is always 1 in coref-hoi, but in earlier
iterations of the model, which are more similar to our model here (since
we aren't using attention or even necessarily BERT), a small depth like
2 was common. This hard-codes a stack of 2.
In brief tests this allows similar performance to the unstacked version
with much smaller embedding sizes.
The depth of the stack could be made into a hyperparameter.
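As a rough sketch (not the actual layer code), the hard-coded depth-2
stack could look like this in Thinc, with hidden_size and the dropout
rate as placeholder hyperparameters:

```python
from thinc.api import Linear, Relu, Dropout, chain

def build_ffnn_stack(hidden_size: int, dropout: float = 0.3, depth: int = 2):
    # Hypothetical sketch of a Linear >> Relu >> Dropout stack. The current
    # code hard-codes depth=2; exposing depth here is how it could later
    # become a hyperparameter.
    layers = []
    for _ in range(depth):
        layers.extend([Linear(hidden_size), Relu(), Dropout(dropout)])
    return chain(*layers)
```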
This generally means fewer spans are considered, which makes individual
training steps faster but can mean training takes longer to find the
good spans.
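For reference, the kind of pruning meant here is roughly the following
(a sketch; the ratio and function name are made up for illustration):

```python
import numpy as np

def prune_candidate_spans(mention_scores, doc_length, ratio=0.4):
    # Keep only the top (ratio * doc_length) candidate spans by mention
    # score. Fewer candidates makes each step cheaper, but gold spans that
    # score poorly early in training take longer to make it into the top-k.
    k = max(1, int(ratio * doc_length))
    return np.argsort(-mention_scores)[:k]
```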
Not necessary for convergence, but in coref-hoi this seems to add a few
F1 points.
Note that there are two width-related features in coref-hoi. This one is
a "prior" that is added to the mention scores. The other width-related
feature is appended to the span embedding representation for other
layers to reference.
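A minimal sketch of the prior variant, assuming a torch-based scorer and
a bucketed width embedding (the class and attribute names are
illustrative, not the actual code):

```python
import torch
from torch import nn

class SpanWidthPrior(nn.Module):
    # One embedding per span width, projected down to a single score that
    # is added to the mention score for that span.
    def __init__(self, max_width: int, feature_size: int = 20):
        super().__init__()
        self.width_embed = nn.Embedding(max_width, feature_size)
        self.project = nn.Linear(feature_size, 1)

    def forward(self, starts, ends, mention_scores):
        widths = (ends - starts).clamp(max=self.width_embed.num_embeddings - 1)
        prior = self.project(self.width_embed(widths)).squeeze(-1)
        return mention_scores + prior
```

The other width feature would instead concatenate the width embedding
onto the span representation itself rather than turning it into a score.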
This rewrites the loss so it doesn't use the Thinc crossentropy code at
all. The main difference is that the negative predictions are masked out
of the target (i.e. marginalized over), but they still receive gradient.
I'm still not sure this is exactly right, but models seem to train
reliably now.
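For context, the shape of the loss is roughly the following marginal
log-likelihood (a numpy sketch, not the actual implementation):

```python
import numpy as np

def marginal_antecedent_loss(scores, gold_mask):
    # scores: (n_mentions, n_candidates) antecedent scores, including a
    #   dummy "no antecedent" column so every row has at least one gold cell.
    # gold_mask: same shape, 1.0 where the candidate is a gold antecedent.
    exp = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    gold_prob = (probs * gold_mask).sum(axis=1, keepdims=True)
    # Maximize the total probability mass on gold antecedents (marginalizing
    # over which gold antecedent is picked) instead of scoring each cell
    # against an independent 0/1 target.
    loss = -np.log(gold_prob).sum()
    # Gradient: the full softmax minus the softmax renormalized over gold
    # cells, so non-gold candidates still receive (downward) gradient even
    # though they're masked out of the target.
    d_scores = probs - (probs * gold_mask) / gold_prob
    return loss, d_scores
```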
The calculation of this in the coref-hoi code is hard to follow. Based
on comments and variable names it sounds like it's using the doc length,
but it might actually be the number of mentions? The number of mentions
should be much larger and seems more correct, but we might want to
revisit this.