spaCy/spacy/ml
Matthew Honnibal 3e78e82a83
Experimental character-based pretraining (#5700)
* Use cosine loss in Cloze multitask

* Fix char_embed for gpu

* Call resume_training for base model in train CLI

* Fix bilstm_depth default in pretrain command

* Implement character-based pretraining objective

* Use chars loss in ClozeMultitask

* Add method to decode predicted characters

* Fix number characters

* Rescale gradients for mlm

* Fix char embed+vectors in ml

* Fix pipes

* Fix pretrain args

* Move get_characters_loss

* Fix import

* Fix import

* Mention characters loss option in pretrain

* Remove broken 'self attention' option in pretrain

* Revert "Remove broken 'self attention' option in pretrain"

This reverts commit 56b820f6af.

* Document 'characters' objective of pretrain
2020-07-05 15:48:39 +02:00
__init__.py         Tidy up and auto-format                                2019-10-28 12:43:55 +01:00
_legacy_tok2vec.py  Experimental character-based pretraining (#5700)       2020-07-05 15:48:39 +02:00
_wire.py            Refactor Tok2Vec to use architecture registry (#4518)  2019-10-25 22:28:20 +02:00
common.py           Replace function registries with catalogue (#4584)     2019-11-07 11:45:22 +01:00
tok2vec.py          Replace function registries with catalogue (#4584)     2019-11-07 11:45:22 +01:00