Stack the mention scorer

The reference implementations usually have a function that builds an
FFNN of arbitrary depth as a stack of Linear >> Relu >> Dropout blocks.
In coref-hoi the depth is in practice always 1, but earlier iterations
of the model, which are closer to our model here (we aren't using
attention or even necessarily BERT), commonly used a small depth such
as 2. This commit hard-codes a stack of 2.

In brief tests this allows similar performance to the unstacked version
with much smaller embedding sizes.

The depth of the stack could be made into a hyperparameter.
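
A minimal sketch of how the depth could become a parameter, assuming
thinc's chain combinator. The helper name build_ffnn and its arguments
are hypothetical and not part of this commit:

```python
from thinc.api import Dropout, Linear, Relu, chain


def build_ffnn(dim: int, hidden: int, depth: int, dropout: float):
    """Stack `depth` Linear >> Relu >> Dropout blocks, then score with Linear.

    Hypothetical helper: mirrors the layer pattern in this commit, but the
    name and signature are assumptions made for illustration.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    layers = []
    n_in = dim
    for _ in range(depth):
        layers.append(Linear(nI=n_in, nO=hidden))
        layers.append(Relu(nI=hidden, nO=hidden))
        layers.append(Dropout(dropout))
        n_in = hidden  # every block after the first maps hidden -> hidden
    # Final projection to a single mention score.
    layers.append(Linear(nI=hidden, nO=1))
    return chain(*layers)
```

With depth=2 this produces the same seven-layer stack that the diff
hard-codes; depth=1 gives the unstacked version.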
Paul O'Leary McCann 2021-08-09 18:04:42 +09:00
parent 56803d3909
commit 00d481dd12


@@ -36,6 +36,9 @@ def build_coref(
Linear(nI=dim, nO=hidden)
>> Relu(nI=hidden, nO=hidden)
>> Dropout(dropout)
>> Linear(nI=hidden, nO=hidden)
>> Relu(nI=hidden, nO=hidden)
>> Dropout(dropout)
>> Linear(nI=hidden, nO=1)
)
mention_scorer.initialize()