spaCy/spacy/pipeline
Matthew Honnibal 3e78e82a83
Experimental character-based pretraining (#5700)
* Use cosine loss in Cloze multitask

* Fix char_embed for gpu

* Call resume_training for base model in train CLI

* Fix bilstm_depth default in pretrain command

* Implement character-based pretraining objective

* Use chars loss in ClozeMultitask

* Add method to decode predicted characters

* Fix number characters

* Rescale gradients for mlm

* Fix char embed+vectors in ml

* Fix pipes

* Fix pretrain args

* Move get_characters_loss

* Fix import

* Fix import

* Mention characters loss option in pretrain

* Remove broken 'self attention' option in pretrain

* Revert "Remove broken 'self attention' option in pretrain"

This reverts commit 56b820f6af.

* Document 'characters' objective of pretrain
2020-07-05 15:48:39 +02:00
__init__.py        Merge changes from master                             2019-08-21 14:18:52 +02:00
entityruler.py     Tidy up and auto-format                               2020-03-25 12:28:12 +01:00
functions.py       Filter subtoken matches in merge_subtokens() (#4539)  2019-10-28 15:40:28 +01:00
hooks.py           Component decorator and component analysis (#4517)    2019-10-27 13:35:49 +01:00
morphologizer.pyx  Component decorator and component analysis (#4517)    2019-10-27 13:35:49 +01:00
pipes.pyx          Experimental character-based pretraining (#5700)      2020-07-05 15:48:39 +02:00