spaCy/examples
Sofie Van Landeghem 311133e579
Train textcat with config (#5143)
* bring back default build_text_classifier method

* remove _set_dims_ hack in favor of proper dim inference

* add tok2vec initialize to unit test

* small fixes

* add unit test for various textcat config settings

* logistic output layer does not have nO

* fix window_size setting

* proper fix

* fix W initialization

* Update textcat training example

* Use ml_datasets
* Convert training data to `Example` format
* Use `n_texts` to set proportionate dev size

* fix _init renaming on latest thinc

* avoid setting a non-existing dim

* update to thinc==8.0.0a2

* add BOW and CNN defaults for easy testing

* various experiments with train_textcat script, fix softmax activation in textcat bow

* allow textcat train script to work on other datasets as well

* have dataset as a parameter

* train textcat from config, with example config

* add config for training textcat

* formatting

* fix exclusive_classes

* fixing BOW for GPU

* bump thinc to 8.0.0a3 (not published yet so CI will fail)

* add in link_vectors_to_models which got deleted

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-03-29 19:40:36 +02:00
experiments Tok2Vec: extract-embed-encode (#5102) 2020-03-08 13:23:18 +01:00
information_extraction Remove max_length parameter 2020-03-24 10:22:12 +01:00
keras_parikh_entailment Fix unicode strings in examples [ci skip] 2019-10-18 18:47:59 +02:00
notebooks 💫 Replace ujson, msgpack and dill/pickle/cloudpickle with srsly (#3003) 2018-12-03 01:28:22 +01:00
pipeline Update spaCy for thinc 8.0.0 (#4920) 2020-01-29 17:06:46 +01:00
training Train textcat with config (#5143) 2020-03-29 19:40:36 +02:00
deep_learning_keras.py Update spaCy for thinc 8.0.0 (#4920) 2020-01-29 17:06:46 +01:00
load_from_docbin.py Generalize handling of tokenizer special cases (#4259) 2019-11-13 21:24:35 +01:00
README.md Get docs ready for v2.0.0 2017-11-07 12:00:43 +01:00
streamlit_spacy.py fix showing dep arcs in streamlit script 2020-03-19 10:30:20 +01:00
vectors_fast_text.py Auto-format examples 2018-12-02 04:26:26 +01:00
vectors_tensorboard.py Restore tqdm imports (#4804) 2019-12-16 13:12:19 +01:00

spaCy examples

The examples are Python scripts with well-behaved command line interfaces. For more detailed usage guides, see the documentation.

To see the arguments a script accepts, run it with the --help or -h flag:

$ python examples/training/train_ner.py --help
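
As a rough illustration of what a "well-behaved command line interface" can look like, here is a minimal sketch using Python's standard `argparse` module. It is not taken from any of the example scripts, which may use a different CLI library; the argument names `--model` and `--n-iter` and the helper names `build_parser` and `main` are hypothetical. The point is only that a parser built this way answers `--help` and `-h` automatically.

```python
import argparse


def build_parser():
    # Hypothetical options for illustration only; each real example
    # script defines its own arguments.
    parser = argparse.ArgumentParser(
        description="Toy training script (illustrative only)."
    )
    parser.add_argument(
        "-m", "--model", default=None, help="Name of an existing model to load"
    )
    parser.add_argument(
        "-n", "--n-iter", type=int, default=20, help="Number of training iterations"
    )
    return parser


def main(argv=None):
    # With argv=None, argparse reads the arguments from sys.argv,
    # which is what a script run from the shell would do.
    args = build_parser().parse_args(argv)
    return args.model, args.n_iter


# Passing an explicit list keeps this demonstration independent of the shell.
model, n_iter = main(["--n-iter", "5"])
print(model, n_iter)
```

argparse adds the `-h`/`--help` option by default, so `build_parser().format_help()` returns the same usage text that the flag would print.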

While we try to keep the examples up to date, they are not currently exercised by the test suite, because some of them require significant data downloads or take a long time to train. If you find that an example no longer runs, please tell us! We know there's nothing worse than trying to figure out what you're doing wrong, only to discover that your code was never the problem.