* Add the configuration schema for distillation
This also adds the default configuration and some tests. The schema will
be used by the training loop and the `distill` subcommand.
* Format
* Change distillation shortopt to -d
* Fix description of max_epochs
* Rename distillation flag to -dt
* Rename `pipe_map` to `student_to_teacher`