spaCy/examples/training
Sofie Van Landeghem 311133e579
Train textcat with config (#5143)
* bring back default build_text_classifier method

* remove _set_dims_ hack in favor of proper dim inference

* add tok2vec initialize to unit test

* small fixes

* add unit test for various textcat config settings

* logistic output layer does not have nO

* fix window_size setting

* proper fix

* fix W initialization

* Update textcat training example

* Use ml_datasets
* Convert training data to `Example` format
* Use `n_texts` to set proportionate dev size

* fix _init renaming on latest thinc

* avoid setting a non-existing dim

* update to thinc==8.0.0a2

* add BOW and CNN defaults for easy testing

* various experiments with train_textcat script, fix softmax activation in textcat bow

* allow textcat train script to work on other datasets as well

* have dataset as a parameter

* train textcat from config, with example config

* add config for training textcat

* formatting

* fix exclusive_classes

* fixing BOW for GPU

* bump thinc to 8.0.0a3 (not published yet so CI will fail)

* add in link_vectors_to_models which got deleted

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-03-29 19:40:36 +02:00
ner_example_data Updates/bugfixes for NER/IOB converters (#4186) 2019-08-29 12:04:01 +02:00
textcat_example_data Add textcat to train CLI (#4226) 2019-09-15 22:31:31 +02:00
conllu-config.json Generalize handling of tokenizer special cases (#4259) 2019-11-13 21:24:35 +01:00
conllu.py Merge branch 'master' into develop 2019-12-21 18:55:03 +01:00
ner_multitask_objective.py Example class for training data (#4543) 2019-11-11 17:35:27 +01:00
pretrain_kb.py Friendly error warning for NEL example script (#4881) 2020-01-14 01:51:14 +01:00
pretrain_textcat.py Default settings to configurations (#4995) 2020-02-27 18:42:27 +01:00
rehearsal.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_entity_linker.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_intent_parser.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_ner.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_new_entity_type.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_parser.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_tagger.py Example class for training data (#4543) 2019-11-11 17:35:27 +01:00
train_textcat_config.cfg Train textcat with config (#5143) 2020-03-29 19:40:36 +02:00
train_textcat.py Train textcat with config (#5143) 2020-03-29 19:40:36 +02:00
training-data.json Revert training example edit from #4327 (#4403) 2019-10-10 17:00:26 +02:00
vocab-data.jsonl Use even smaller example size 2017-10-30 19:46:45 +01:00