Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Provides a comprehensive academic treatment of neural network optimization, including the role and impact of the learning rate in gradient descent.
Neural Networks and Deep Learning, Michael A. Nielsen, 2015 (Determination Press) - An accessible online book explaining the basics of neural networks, including a clear explanation of gradient descent and the importance of the learning rate.
Optimization: Stochastic Gradient Descent, Andrej Karpathy, Justin Johnson, and Li Fei-Fei, 2023 (Stanford CS231n Course Notes) - Part of a highly regarded university course, these notes offer practical and intuitive explanations of optimization algorithms like SGD, focusing on the learning rate.