The calculation of this value in the coref-hoi code is hard to follow. Based
on comments and variable names it sounds like it uses the doc length, but it
may actually be the number of mentions. The number of mentions should be much
larger and seems more correct, but this is worth revisiting.
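As a purely illustrative sketch (the token list and max span width below are assumptions, not values from the coref-hoi config), this is roughly why the two interpretations differ so much in magnitude: candidate mentions are all spans up to a maximum width, so their count is far larger than the doc length.

```python
max_span_width = 10  # assumed value, not from the actual config
tokens = ["A", "toy", "document", "with", "a", "handful", "of", "tokens", "."]

doc_length = len(tokens)
# Count every span of up to max_span_width tokens starting at each position.
num_candidate_mentions = sum(
    min(max_span_width, doc_length - start) for start in range(doc_length)
)
print(doc_length, num_candidate_mentions)  # 9 vs. 45 for this toy doc
```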
The call here was creating a float64 array, which turned many downstream
scores into float64s. Later these values were assigned to a float32 array in
backprop, and numerical underflow caused them to go to zero.
That's almost certainly not the only reason values go to zero, but the dtype
mismatch is incorrect regardless.
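A minimal sketch of the dtype issue with numpy (the shapes and names here are illustrative, not the actual call):

```python
import numpy as np

num_mentions, num_antecedents = 4, 3

# Without an explicit dtype, np.zeros produces float64, which then leaks
# into every downstream score computed from this array.
scores = np.zeros((num_mentions, num_antecedents))
assert scores.dtype == np.float64

# Being explicit keeps everything float32, matching the array the values
# are later written into during backprop.
scores = np.zeros((num_mentions, num_antecedents), dtype="float32")
assert scores.dtype == np.float32
```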
At a few points in the code it's normal to get a "2d" array where each
row is a single entry. Calling squeeze will turn that into a proper 1d
array... unless there is only one row, in which case it becomes a 0d
scalar. That's not what we want; flatten() provides the desired
behavior.
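For example, with numpy:

```python
import numpy as np

many = np.array([[1.0], [2.0], [3.0]])  # several single-entry rows
one = np.array([[1.0]])                 # just one single-entry row

print(many.squeeze().shape)  # (3,)  -- fine
print(one.squeeze().shape)   # ()    -- 0d scalar, not what we want
print(one.flatten().shape)   # (1,)  -- proper 1d array
```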
When sentence boundaries are not available, just treat the whole doc as one
sentence. This is a reasonable general fallback, and it matters in particular
during the init call, when upstream components haven't been run.
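A minimal sketch of that fallback, assuming a spaCy Doc (not the actual component code):

```python
import spacy

nlp = spacy.blank("en")  # no parser/senter, so no sentence boundaries
doc = nlp("One sentence. Another sentence.")

# If sentence boundaries haven't been set -- e.g. during initialization,
# before upstream components run -- treat the whole doc as one sentence.
if doc.has_annotation("SENT_START"):
    sentences = list(doc.sents)
else:
    sentences = [doc[:]]
```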
This includes the coref code that was previously being tested separately,
modified to work in spaCy. The modified version hasn't been tested yet and
presumably still needs fixes.
In particular, the evaluation code is currently omitted. It's unclear at
the moment whether we want to use a complex scorer similar to the
official one, or a simpler scorer using more modern evaluation methods.
* initial coref_er pipe
* matcher more flexible
* base coref component without actual model
* initial setup of coref_er.score
* rename to include_label
* preliminary score_clusters method
* apply scoring in coref component
* IO fix
* return None loss for now
* rename to CoreferenceResolver
* some preliminary unit tests
* use registry as callable