spaCy/spacy/tests/pipeline
Sofie Van Landeghem 311133e579
Train textcat with config (#5143)
* bring back default build_text_classifier method

* remove _set_dims_ hack in favor of proper dim inference

* add tok2vec initialize to unit test

* small fixes

* add unit test for various textcat config settings

* logistic output layer does not have nO

* fix window_size setting

* proper fix

* fix W initialization

* Update textcat training example

* Use ml_datasets
* Convert training data to `Example` format
* Use `n_texts` to set proportionate dev size

* fix _init renaming on latest thinc

* avoid setting a non-existing dim

* update to thinc==8.0.0a2

* add BOW and CNN defaults for easy testing

* various experiments with train_textcat script, fix softmax activation in textcat bow

* allow textcat train script to work on other datasets as well

* have dataset as a parameter

* train textcat from config, with example config

* add config for training textcat

* formatting

* fix exclusive_classes

* fixing BOW for GPU

* bump thinc to 8.0.0a3 (not published yet so CI will fail)

* add in link_vectors_to_models which got deleted

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-03-29 19:40:36 +02:00
__init__.py Revert #4334 2019-09-29 17:32:12 +02:00
test_analysis.py Default settings to configurations (#4995) 2020-02-27 18:42:27 +01:00
test_entity_linker.py Unit test for NEL functionality (#5114) 2020-03-06 14:42:23 +01:00
test_entity_ruler.py Tidy up and auto-format 2020-02-18 15:38:18 +01:00
test_factories.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
test_functions.py Drop Python 2.7 and 3.5 (#4828) 2019-12-22 01:53:56 +01:00
test_pipe_methods.py More formatting changes 2019-12-25 17:59:52 +01:00
test_sentencizer.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
test_senter.py Tok2Vec: extract-embed-encode (#5102) 2020-03-08 13:23:18 +01:00
test_tagger.py Default settings to configurations (#4995) 2020-02-27 18:42:27 +01:00
test_textcat.py Train textcat with config (#5143) 2020-03-29 19:40:36 +02:00