Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015. arXiv preprint arXiv:1503.02531. DOI: 10.48550/arXiv.1503.02531 - The original paper introducing knowledge distillation, a technique for creating smaller, faster models from larger ones, relevant for understanding distillation trade-offs.
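
To make the cited technique concrete, here is a minimal sketch of the distillation objective the paper proposes: a weighted sum of a soft-target loss (KL divergence between temperature-softened teacher and student distributions, scaled by T^2 as in the paper) and the standard hard-label cross-entropy. This assumes PyTorch; the function name `distillation_loss` and the hyperparameter values `T=2.0` and `alpha=0.5` are illustrative, not prescribed by the paper.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Illustrative hyperparameters: T (temperature) and alpha (loss mix)
    # are tuning knobs; the paper does not fix particular values here.
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2 so gradient
    # magnitudes stay comparable across temperatures (as in the paper).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

The trade-offs the annotation mentions live in these two knobs: a higher temperature exposes more of the teacher's "dark knowledge" in the relative probabilities of wrong classes, while alpha balances mimicking the teacher against fitting the ground-truth labels.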