Calculus: Early Transcendentals, James Stewart, 2015 (Cengage Learning) - Offers a comprehensive treatment of single and multivariable calculus, with a detailed explanation of the chain rule as a fundamental concept for derivatives of composite functions.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A widely used textbook on deep learning, providing a detailed and rigorous mathematical derivation of the backpropagation algorithm and explicitly demonstrating the chain rule's application to gradient computation.
Neural Networks and Deep Learning, Michael Nielsen, 2019 - An accessible online textbook that offers an intuitive, step-by-step explanation of the backpropagation algorithm, with a clear focus on how the chain rule is applied to calculate gradients.
Backpropagation, Intuitions (CS231n: Convolutional Neural Networks for Visual Recognition), Andrej Karpathy, Justin Johnson, Serena Yeung, and Fei-Fei Li, 2023 (Stanford University CS231n Course Notes) - Stanford's widely referenced course notes providing a practical and intuitive explanation of backpropagation, emphasizing the role of the chain rule in efficiently computing gradients for neural networks.