From 00d481dd12c1fc6ed5a9ef865f775159b76ac2c4 Mon Sep 17 00:00:00 2001
From: Paul O'Leary McCann
Date: Mon, 9 Aug 2021 18:04:42 +0900
Subject: [PATCH] Stack the mention scorer

In the reference implementations, there's usually a function to build a
ffnn of arbitrary depth, consisting of a stack of Linear >> Relu >>
Dropout. In practice the depth is always 1 in coref-hoi, but in earlier
iterations of the model, which are more similar to our model here (since
we aren't using attention or even necessarily BERT), using a small depth
like 2 was common.

This hard-codes a stack of 2. In brief tests this allows similar
performance to the unstacked version with much smaller embedding sizes.

The depth of the stack could be made into a hyperparameter.
---
 spacy/ml/models/coref.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/spacy/ml/models/coref.py b/spacy/ml/models/coref.py
index 3b14e6ecb..511e44476 100644
--- a/spacy/ml/models/coref.py
+++ b/spacy/ml/models/coref.py
@@ -36,6 +36,9 @@ def build_coref(
             Linear(nI=dim, nO=hidden)
             >> Relu(nI=hidden, nO=hidden)
             >> Dropout(dropout)
+            >> Linear(nI=hidden, nO=hidden)
+            >> Relu(nI=hidden, nO=hidden)
+            >> Dropout(dropout)
             >> Linear(nI=hidden, nO=1)
         )
         mention_scorer.initialize()
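
Note on the depth hyperparameter mentioned in the commit message: the
sketch below is one possible way to derive the layer stack from a depth
value instead of hard-coding it. It is plain Python and does not touch
thinc; the helper name `build_stack_spec` and the tuple layout are
hypothetical and not part of this patch or of spaCy. It only lists which
layers, with which nI/nO sizes, would be chained with `>>`.

```python
from typing import List, Tuple

# (layer name, nI, nO); the names mirror the layers used in the patch.
LayerSpec = Tuple[str, int, int]


def build_stack_spec(dim: int, hidden: int, depth: int) -> List[LayerSpec]:
    """Describe a scorer made of `depth` Linear >> Relu >> Dropout blocks.

    Hypothetical helper for illustration only. The first block maps
    dim -> hidden, later blocks map hidden -> hidden, and a final Linear
    layer maps hidden -> 1 to produce the score.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    spec: List[LayerSpec] = []
    n_in = dim
    for _ in range(depth):
        spec.append(("Linear", n_in, hidden))
        spec.append(("Relu", hidden, hidden))
        # Dropout does not change the width; sizes are listed for symmetry.
        spec.append(("Dropout", hidden, hidden))
        n_in = hidden
    spec.append(("Linear", hidden, 1))
    return spec


# depth=2 lists the same layers, in the same order, as the stack in this patch.
for name, n_in, n_out in build_stack_spec(dim=64, hidden=32, depth=2):
    print(name, n_in, n_out)
```

With depth=1 the spec reduces to the unstacked scorer that existed before
this change. The dim and hidden values above are placeholders.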