Mixed-Precision Training, Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu, 2018 (International Conference on Learning Representations, ICLR), DOI: 10.48550/arXiv.1710.03740 - Introduces mixed-precision training, detailing FP16 range issues (underflow/overflow) and the proposed solution of loss scaling.
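A minimal sketch of the loss-scaling step described in the paper, using a hypothetical linear model and a fixed scale factor (the paper additionally keeps FP32 master copies of the weights, omitted here for brevity):

```python
import torch

# Hypothetical model, optimizer, and loss; any standard setup works for the sketch.
model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()

scale = 1024.0  # fixed loss-scale constant, the simplest scheme in the paper

def train_step(inputs, targets):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    # Scale the loss so small gradient values stay representable in FP16
    # instead of underflowing to zero during the backward pass.
    (loss * scale).backward()
    # Unscale the gradients before the update so the effective step is unchanged.
    for p in model.parameters():
        if p.grad is not None:
            p.grad.div_(scale)
    optimizer.step()
    return loss.item()
```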
A New Standard for Mixed-Precision Training: bfloat16, Karen Young, David Patterson, Cliff Young, 2019 (Google AI Blog) - Explains the BFloat16 format, highlighting its wider dynamic range compared to FP16, which directly addresses underflow and overflow concerns in mixed-precision training.
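The range difference can be seen by querying the format limits directly; the snippet below is a small illustration using PyTorch's torch.finfo, not code from the blog post:

```python
import torch

for dtype in (torch.float16, torch.bfloat16, torch.float32):
    info = torch.finfo(dtype)
    # bfloat16 keeps FP32's 8 exponent bits, so its max and smallest-normal values
    # match FP32's range, while FP16's 5 exponent bits cap it at ~65504 and push
    # underflow to around 6e-5; bfloat16 trades this range for fewer fraction bits (larger eps).
    print(f"{str(dtype):16s} max={info.max:.3e}  smallest_normal={info.tiny:.3e}  eps={info.eps:.3e}")
```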
Computer Architecture: A Quantitative Approach, John L. Hennessy, David A. Patterson, 2017 (Morgan Kaufmann) - A foundational textbook with a thorough treatment of floating-point arithmetic, including the IEEE 754 standard, bit allocations, and the numerical properties of formats such as FP16 and FP32.
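As a small, self-contained illustration of the bit allocations covered in that material (not code from the book): FP32 splits its 32 bits into 1 sign, 8 exponent, and 23 fraction bits; FP16 uses 1/5/10; BF16 uses 1/8/7. The sketch below decodes an FP32 value into those fields with Python's standard library:

```python
import struct

def fp32_fields(x: float):
    # Reinterpret the IEEE 754 single-precision encoding as a 32-bit integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8-bit exponent field, biased by 127
    fraction = bits & 0x7FFFFF       # 23-bit fraction field (implicit leading 1)
    return sign, exponent, fraction

print(fp32_fields(1.0))    # (0, 127, 0): exponent bias 127, fraction all zero
print(fp32_fields(-0.5))   # (1, 126, 0): negative sign, exponent one below the bias
```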
Automatic Mixed Precision for Deep Learning, PyTorch Documentation, 2024 - Official PyTorch guide explaining mixed-precision training, including how framework-level solutions like torch.cuda.amp address the numerical stability challenges posed by FP16.
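A typical training-loop pattern following the documented torch.cuda.amp API, with a placeholder model, optimizer, and loss: autocast runs eligible ops in FP16, and GradScaler applies dynamic loss scaling, skipping the optimizer step when FP16 gradients overflow.

```python
import torch

# Placeholder CUDA model and training objects for the sketch.
model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()  # dynamic loss scaling

def train_step(inputs, targets):
    optimizer.zero_grad()
    # Forward pass in mixed precision: eligible ops are cast to FP16 automatically.
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(inputs), targets)
    # Scale the loss, backprop, then unscale and step; the scaler skips the step
    # and reduces the scale if inf/NaN gradients are detected.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```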