mirror of https://github.com/explosion/spaCy.git (synced 2025-07-18 20:22:25 +03:00)
Stack the mention scorer
In the reference implementations, there's usually a function to build an FFNN of arbitrary depth, consisting of a stack of Linear >> Relu >> Dropout. In practice the depth is always 1 in coref-hoi, but in earlier iterations of the model, which are more similar to our model here (since we aren't using attention or even necessarily BERT), a small depth like 2 was common. This commit hard-codes a stack of 2. In brief tests, this gives similar performance to the unstacked version with much smaller embedding sizes. The depth of the stack could be made into a hyperparameter.
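As a rough illustration of the last point, the stack could be built in a loop so that the depth becomes an argument. This is only a sketch, not code from this commit: `build_mention_scorer` and its `depth` parameter are hypothetical names, and the sizes in the usage lines are arbitrary. It assumes Thinc's `chain`, `Linear`, `Relu` and `Dropout` from `thinc.api`, the same layers that appear in the diff.

```python
import numpy
from thinc.api import Dropout, Linear, Model, Relu, chain


def build_mention_scorer(dim: int, hidden: int, dropout: float, depth: int = 2) -> Model:
    """Hypothetical helper: stack `depth` blocks of Linear >> Relu >> Dropout,
    then project down to a single score per input row."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    layers = []
    n_in = dim  # only the first block sees the input width
    for _ in range(depth):
        layers.append(Linear(nI=n_in, nO=hidden))
        layers.append(Relu(nI=hidden, nO=hidden))
        layers.append(Dropout(dropout))
        n_in = hidden  # later blocks map hidden -> hidden
    layers.append(Linear(nI=hidden, nO=1))
    scorer = chain(*layers)
    scorer.initialize()
    return scorer


# Arbitrary sizes, chosen only to exercise the helper.
scorer = build_mention_scorer(dim=16, hidden=8, dropout=0.3, depth=2)
scores = scorer.predict(numpy.zeros((4, 16), dtype="f"))
print(scores.shape)
```

With `depth=2` this should produce the same layer sequence as the hard-coded stack in this commit, and `depth=1` the sequence from before it.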
parent 56803d3909
commit 00d481dd12
@@ -36,6 +36,9 @@ def build_coref(
     Linear(nI=dim, nO=hidden)
     >> Relu(nI=hidden, nO=hidden)
     >> Dropout(dropout)
+    >> Linear(nI=hidden, nO=hidden)
+    >> Relu(nI=hidden, nO=hidden)
+    >> Dropout(dropout)
     >> Linear(nI=hidden, nO=1)
 )
 mention_scorer.initialize()