Mixed-Precision Training, Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu, 2018, International Conference on Learning Representations (ICLR), DOI: 10.48550/arXiv.1710.03740 - The foundational paper introducing mixed-precision training with FP16 and loss scaling, explaining its benefits and practical challenges.
Automatic Mixed Precision examples, PyTorch Contributors, 2024 - Official PyTorch documentation with examples and guidelines for implementing and debugging mixed-precision training using torch.cuda.amp.GradScaler.