Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015. arXiv preprint arXiv:1503.02531. DOI: 10.48550/arXiv.1503.02531 - Introduces the foundational concept of knowledge distillation, including the use of soft targets and temperature scaling to transfer knowledge from a large model to a smaller one.
Knowledge Distillation: A Survey, Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao, 2021. International Journal of Computer Vision, Vol. 129 (Springer). DOI: 10.1007/s11263-021-01453-z - Comprehensive review of knowledge distillation techniques, covering the main methods, applications, and open research directions.