spaCy/examples/training
Matthew Honnibal 333b1a308b
Adapt parser and NER for transformers (#5449)
* Draft layer for BILUO actions

* Fixes to biluo layer

* WIP on BILUO layer

* Add tests for BILUO layer

* Format

* Fix transitions

* Update test

* Link in the simple_ner

* Update BILUO tagger

* Update __init__

* Import simple_ner

* Update test

* Import

* Add files

* Add config

* Fix label passing for BILUO and tagger

* Fix label handling for simple_ner component

* Update simple NER test

* Update config

* Hack train script

* Update BILUO layer

* Fix SimpleNER component

* Update train_from_config

* Add biluo_to_iob helper

* Add IOB layer

* Add IOBTagger model

* Update biluo layer

* Update SimpleNER tagger

* Update BILUO

* Read random seed in train-from-config

* Update use of normal_init

* Fix normalization of gradient in SimpleNER

* Update IOBTagger

* Remove print

* Tweak masking in BILUO

* Add dropout in SimpleNER

* Update thinc

* Tidy up simple_ner

* Fix biluo model

* Unhack train-from-config

* Update setup.cfg and requirements

* Add tb_framework.py for parser model

* Try to avoid memory leak in BILUO

* Move ParserModel into spacy.ml, avoid need for subclass.

* Use updated parser model

* Remove incorrect call to model.initialize in PrecomputableAffine

* Update parser model

* Avoid divide by zero in tagger

* Add extra dropout layer in tagger

* Refine minibatch_by_words function to avoid oom

* Fix parser model after refactor

* Try to avoid div-by-zero in SimpleNER

* Fix infinite loop in minibatch_by_words

* Use SequenceCategoricalCrossentropy in Tagger

* Fix parser model when hidden layer

* Remove extra dropout from tagger

* Add extra nan check in tagger

* Fix thinc version

* Update tests and imports

* Fix test

* Update test

* Update tests

* Fix tests

* Fix test

Co-authored-by: Ines Montani <ines@ines.io>
2020-05-18 22:23:33 +02:00
ner_example_data Updates/bugfixes for NER/IOB converters (#4186) 2019-08-29 12:04:01 +02:00
textcat_example_data Add textcat to train CLI (#4226) 2019-09-15 22:31:31 +02:00
conllu-config.json Generalize handling of tokenizer special cases (#4259) 2019-11-13 21:24:35 +01:00
conllu.py Merge branch 'master' into develop 2019-12-21 18:55:03 +01:00
ner_multitask_objective.py Example class for training data (#4543) 2019-11-11 17:35:27 +01:00
pretrain_kb.py Friendly error warning for NEL example script (#4881) 2020-01-14 01:51:14 +01:00
pretrain_textcat.py Default settings to configurations (#4995) 2020-02-27 18:42:27 +01:00
rehearsal.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_entity_linker.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_intent_parser.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_morphologizer.py Update morphologizer (#5108) 2020-04-02 14:46:32 +02:00
train_ner.py Adapt parser and NER for transformers (#5449) 2020-05-18 22:23:33 +02:00
train_new_entity_type.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_parser.py Merge branch 'master' into develop 2020-02-18 14:47:23 +01:00
train_tagger.py Example class for training data (#4543) 2019-11-11 17:35:27 +01:00
train_textcat_config.cfg Train textcat with config (#5143) 2020-03-29 19:40:36 +02:00
train_textcat.py Train textcat with config (#5143) 2020-03-29 19:40:36 +02:00
training-data.json Revert training example edit from #4327 (#4403) 2019-10-10 17:00:26 +02:00
vocab-data.jsonl Use even smaller example size 2017-10-30 19:46:45 +01:00