spaCy/spacy/ml/models/parser.py
Sofie Van Landeghem 311133e579
Train textcat with config (#5143)
* bring back default build_text_classifier method

* remove _set_dims_ hack in favor of proper dim inference

* add tok2vec initialize to unit test

* small fixes

* add unit test for various textcat config settings

* logistic output layer does not have nO

* fix window_size setting

* proper fix

* fix W initialization

* Update textcat training example

* Use ml_datasets
* Convert training data to `Example` format
* Use `n_texts` to set proportionate dev size

* fix _init renaming on latest thinc

* avoid setting a non-existing dim

* update to thinc==8.0.0a2

* add BOW and CNN defaults for easy testing

* various experiments with train_textcat script, fix softmax activation in textcat bow

* allow textcat train script to work on other datasets as well

* have dataset as a parameter

* train textcat from config, with example config

* add config for training textcat

* formatting

* fix exclusive_classes

* fixing BOW for GPU

* bump thinc to 8.0.0a3 (not published yet so CI will fail)

* add in link_vectors_to_models which got deleted

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-03-29 19:40:36 +02:00


from pydantic import StrictInt
from thinc.api import Model, chain, list2array, Linear, zero_init, use_ops

from ...util import registry
from .._precomputable_affine import PrecomputableAffine
from ...syntax._parser_model import ParserModel


@registry.architectures.register("spacy.TransitionBasedParser.v1")
def build_tb_parser_model(
    tok2vec: Model,
    nr_feature_tokens: StrictInt,
    hidden_width: StrictInt,
    maxout_pieces: StrictInt,
    nO=None,
):
    token_vector_width = tok2vec.get_dim("nO")
    tok2vec = chain(tok2vec, list2array())
    tok2vec.set_dim("nO", token_vector_width)

    lower = PrecomputableAffine(
        nO=hidden_width,
        nF=nr_feature_tokens,
        nI=tok2vec.get_dim("nO"),
        nP=maxout_pieces,
    )
    lower.set_dim("nP", maxout_pieces)
    with use_ops("numpy"):
        # Initialize weights at zero, as it's a classification layer.
        upper = Linear(nO=nO, init_W=zero_init)
    model = ParserModel(tok2vec, lower, upper)
    return model
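
The comment on the `upper` layer says its weights start at zero because it is a classification layer. The following is a minimal sketch of what that implies, written in plain Python rather than thinc: `zero_init_linear`, `forward`, and `softmax` are hypothetical helpers introduced only for illustration, and they are not part of spaCy or thinc. With all-zero weights and biases, every class gets the same score regardless of the input, so (assuming the scores are normalized with a softmax) no class is favored before training.

```python
import math


def zero_init_linear(n_out, n_in):
    """Hypothetical stand-in for a zero-initialized linear layer (not thinc's Linear)."""
    W = [[0.0] * n_in for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b


def forward(W, b, x):
    """Compute W @ x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Four output classes, three input features; the sizes are arbitrary.
W, b = zero_init_linear(n_out=4, n_in=3)
scores = forward(W, b, [0.5, -1.2, 3.0])
probs = softmax(scores)
print(scores)  # all zeros, whatever the input
print(probs)   # uniform: 0.25 for each of the four classes
```

In the real model the output width `nO` is left as `None` and inferred later, so the fixed sizes here are an assumption made to keep the sketch runnable.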