Paul O'Leary McCann
64a0bf4460
Merge branch 'feature/coref' into coref/dimension-inference
2022-07-12 12:56:10 +09:00
Paul O'Leary McCann
baeb35f31b
Add type annotations for internal models
2022-07-11 20:03:29 +09:00
Paul O'Leary McCann
4d032396b8
Merge branch 'feature/coref' into coref/dimension-inference
2022-07-11 19:18:46 +09:00
Paul O'Leary McCann
6d9eafeb37
Merge branch 'feature/coref' into fix/coref-alignment
2022-07-11 19:14:37 +09:00
Paul O'Leary McCann
2eee0d248e
Fix types
...
mypy now exits without an error, except for two apparently unrelated
ones about setup.py.
2022-07-08 18:29:14 +09:00
Paul O'Leary McCann
b59b924e49
Use normal PyTorchWrapper in coref
2022-07-06 19:22:19 +09:00
Paul O'Leary McCann
f67c1735c5
Remove tok2vec_size from coref
2022-07-06 18:58:57 +09:00
Paul O'Leary McCann
bd17c38b74
It works!
...
Was missing the serialization-related code from biaffine.
2022-07-06 18:58:22 +09:00
Paul O'Leary McCann
ba1bf8ae72
First take at dimension inference
...
This follows the pattern used in the Biaffine Parser, which uses an init
function to get the size only after the tok2vec is available.
This works at first, but serialization fails with an error.
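As a rough illustration of the deferred-size pattern described above, here is a minimal plain-Python sketch. `LazyLinear` and its `init` method are hypothetical names made up for this note; they are not the thinc or spaCy API, and the real change may differ in detail. The layer is built without knowing its input width and only allocates weights once sample input (standing in for tok2vec output) is available.

```python
from typing import List, Optional


class LazyLinear:
    """Hypothetical layer whose input width is unknown until init() runs."""

    def __init__(self, n_out: int, n_in: Optional[int] = None) -> None:
        self.n_out = n_out
        self.n_in = n_in  # None means "infer from sample data later"
        self.weights: Optional[List[List[float]]] = None

    def init(self, sample: List[List[float]]) -> None:
        # Infer the input width from the first sample row if it was not given.
        if self.n_in is None:
            self.n_in = len(sample[0])
        # Allocate parameters only now that both dimensions are known.
        self.weights = [[0.0] * self.n_in for _ in range(self.n_out)]

    def __call__(self, rows: List[List[float]]) -> List[List[float]]:
        if self.weights is None:
            raise RuntimeError("call init() with sample input first")
        return [
            [sum(w * x for w, x in zip(w_row, row)) for w_row in self.weights]
            for row in rows
        ]


layer = LazyLinear(n_out=3)
layer.init([[0.1] * 64])  # stand-in for a batch of tok2vec output rows
```

Because the size is filled in after construction, any save/load code has to record the inferred width as well as the weights, which is one plausible place for serialization to go wrong.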
2022-07-06 18:40:05 +09:00
Paul O'Leary McCann
619b1102e6
Use config to specify tok2vec_size
2022-07-03 15:32:35 +09:00
Paul O'Leary McCann
79720886fa
Merge branch 'feature/coref' into fix/coref-alignment
...
Had to renumber error message.
2022-07-01 19:09:29 +09:00
kadarakos
1a782592c4
make sure same device
2022-06-28 12:53:20 +00:00
Paul O'Leary McCann
ef5762d78e
Bad hack to get tests to run
...
This changes the tok2vec size in coref to hardcoded 64 to get tests to
run. This should be reverted and hopefully replaced with proper shape
inference.
2022-06-28 19:06:13 +09:00
Paul O'Leary McCann
16894e665d
Refactor Coval Scoring code (#10875)
...
* Move coref scoring code to scorer.py
Includes some renames to make names less generic.
* Refactor coval code to remove ternary expressions
* Black formatting
* Add header
* Make scorers into registered scorers
* Small test fixes
* Skip coref tests when torch not present
Coref can't be loaded without Torch, so nothing works.
* Fix remaining type issues
Some of this just involves ignoring types in thorny areas. Two main
issues:
1. Some things have weird types due to indirection/argskwargs
2. xp2torch return type seems to have changed at some point
* Update spacy/scorer.py
Co-authored-by: kadarakos <kadar.akos@gmail.com>
* Small changes from review
* Be specific about the ValueError
* Type fix
Co-authored-by: kadarakos <kadar.akos@gmail.com>
2022-06-22 16:05:52 +09:00
Paul O'Leary McCann
196886bbca
Fix coref size inference (#10916)
...
* Add explicit tok2vec_size parameter in clusterer
* Add tok2vec size to span predictor config
* Minor fixes
2022-06-08 20:03:41 +09:00
svlandeg
cea40c9d7b
fix types + black formatting
2022-05-25 13:34:09 +02:00
Paul O'Leary McCann
838f50192b
Black formatting
2022-05-25 19:20:03 +09:00
Paul O'Leary McCann
2a8efda689
Code review suggestions, cleanup
2022-05-25 19:18:26 +09:00
Paul O'Leary McCann
c9233a5a1f
Import torch from thinc
2022-05-24 17:28:27 +09:00
Paul O'Leary McCann
b1118cee58
Move epsilon
2022-05-24 15:59:08 +09:00
Paul O'Leary McCann
9da16df96e
Add guards around torch import
...
Torch is required for the coref/spanpred models but shouldn't be
required for spaCy in general.
The one tricky part of this is that one function in coref_util relied on
torch, but that file was imported in several places. Since the function
was only used in one place I moved it there.
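A minimal sketch of this kind of import guard, assuming a module-level flag and a helper that raises only when a torch-dependent function is actually called. The names `has_torch`, `require_torch` and `pairwise_scores` are illustrative and not taken from the spaCy source.

```python
try:
    import torch

    has_torch = True
except ImportError:
    # Torch is optional: importing this module must still succeed without it.
    torch = None
    has_torch = False


def require_torch(feature: str) -> None:
    """Raise a clear error if a torch-only feature is used without torch."""
    if not has_torch:
        raise ImportError(f"{feature} requires PyTorch, which is not installed.")


def pairwise_scores(vectors):
    """Hypothetical torch-dependent helper, kept next to its only caller."""
    require_torch("pairwise_scores")
    t = torch.tensor(vectors, dtype=torch.float32)
    return (t @ t.T).tolist()
```

Keeping the torch-dependent function in the one module that uses it means the shared utility file can be imported everywhere without pulling in torch.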
2022-05-24 15:16:25 +09:00
Paul O'Leary McCann
2e8f0e9168
Rename coref params
2022-05-16 16:50:10 +09:00
Paul O'Leary McCann
13481fbcc2
Remove unused param, add TODOs about typing
2022-05-13 19:29:28 +09:00
kadarakos
7cf6bcca0e
merge misery
2022-05-10 17:19:16 +00:00
kadarakos
e512874c80
small refactor and docs
2022-05-10 16:40:31 +00:00
Paul O'Leary McCann
33f4f90ff0
Formatting
2022-05-10 19:09:52 +09:00
Paul O'Leary McCann
41fc092674
Split span predictor model into its own file
2022-05-10 19:08:21 +09:00
svlandeg
6b51258a58
clean up unused imports + black formatting
2022-05-09 13:34:50 +02:00
kadarakos
b53113e3b8
Preparing span predictor for predicting from gold (#10547)
...
Note this is squashed because rebasing had conflicts.
* remove unnecessary .device
* span predictor debug start
* gearing up SpanPredictor for gold-heads
* merge SpanPredictor attributes
* remove useless extra prefix and device from spanpredictor
* make sure predicted and reference stay aligned
* handle empty head_ids
* handle empty clusters
* addressing suggestions by @polm
* nicer restore
* fix score overwriting bug
* prepare for aligned heads-spans training
* span accuracy score
* update with eg.predicted as other components
* add backprop callback to spanpredictor
* report start- and end-accuracies separately
* fixing scorer
Co-authored-by: Kádár Ákos <akos@onyx.uvt.nl>
2022-04-13 19:42:49 +09:00
Kádár Ákos
2a1ad4c5d2
add backprop callback to spanpredictor
2022-04-08 14:56:44 +02:00
Kádár Ákos
4fc40340f9
handle empty head_ids
2022-03-28 11:28:21 +02:00
Kádár Ákos
83ac0477c8
remove useless extra prefix and device from spanpredictor
2022-03-24 16:44:50 +01:00
Kádár Ákos
1c5dabcb47
merge SpanPredictor attributes
2022-03-24 16:23:12 +01:00
Kádár Ákos
a872c69ffb
merge
2022-03-24 16:10:04 +01:00
Kádár Ákos
706b2e6f25
gearing up SpanPredictor for gold-heads
2022-03-24 16:06:20 +01:00
Kádár Ákos
150e7c46d7
conflict
2022-03-23 11:27:02 +01:00
Kádár Ákos
1eaf8fb0cf
span predictor debug start
2022-03-23 11:24:27 +01:00
Paul O'Leary McCann
eec00ce60d
Fix various sizes in SpanPredictor FFNN
2022-03-23 16:20:31 +09:00
Paul O'Leary McCann
2190cbc0e6
Add progress on SpanPredictor component
...
This isn't working. There is a CUDA error in the torch code during
initialization and it's not clear why.
2022-03-19 19:39:49 +09:00
Kádár Ákos
db422abf01
remove unnecessary .device
2022-03-18 16:24:26 +01:00
Paul O'Leary McCann
0275ae29de
Remove stale comment
2022-03-16 20:09:12 +09:00
Paul O'Leary McCann
6974f55daa
Hack for transformer listener size
2022-03-16 15:15:53 +09:00
Paul O'Leary McCann
d0ae2590db
Delete all the coref-hoi code
2022-03-15 20:05:24 +09:00
Paul O'Leary McCann
abdc7d87af
Clean up util code
...
Moved everything into coref_util.py, deleted wl-specific file.
2022-03-15 19:59:44 +09:00
Paul O'Leary McCann
8eadf3781b
Training runs now
...
Evaluation needs fixing, and code still needs cleanup.
2022-03-14 19:02:17 +09:00
Paul O'Leary McCann
d22a002641
Forward/backward pass works
...
Evaluate does not work - predict hasn't been updated
2022-03-14 17:26:27 +09:00
Paul O'Leary McCann
c4f9c24738
The coref model can be loaded
...
The span predictor component is initialized but not used at all now.
Plan is to work on it after the word level clustering part is trainable
end-to-end.
2022-03-09 19:31:11 +09:00
Paul O'Leary McCann
35cc2b138f
Add span predictor code
...
Accidentally omitted before
2022-03-08 18:13:26 +09:00
Paul O'Leary McCann
1c697b4011
Remove references to config
...
Replaced with model arguments
2022-03-08 18:13:09 +09:00
Paul O'Leary McCann
c0cd5025e3
Start bringing in wl-coref
...
This absolutely does not work. The first step here is bringing over most of
the code into roughly the files we want it in. After the code has been
pulled over it can be restructured to match spaCy and cleaned up.
2022-03-06 20:00:15 +09:00