Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive and authoritative textbook covering the theoretical foundations and practical aspects of deep learning, including a discussion of Batch Normalization in the context of optimization.
Dive into Deep Learning, Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, 2023 (Cambridge University Press) - An interactive, open-source textbook that provides detailed explanations and step-by-step mathematical derivations for various deep learning components, including a dedicated section on Batch Normalization's forward and backward passes.