Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A foundational textbook that covers optimization algorithms, including the theoretical basis and practical difficulties of Batch Gradient Descent in deep learning.
Neural Networks Part 3: Learning and Evaluation, Andrej Karpathy, Justin Johnson, and Fei-Fei Li, 2023 (Stanford University) - These Stanford CS231n course notes offer practical guidance on the challenges of training deep neural networks, including the computational demands and memory constraints of Batch Gradient Descent and the complexity of the loss surface.
The Loss Surfaces of Multilayer Networks, Anna Choromanska, Mikael Henaff, Michael Mathieu, Gerard Ben Arous, and Yann LeCun, 2015, Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, Vol. 38 (PMLR) - Presents a theoretical and empirical examination of the loss landscape of neural networks, arguing that in high dimensions most local minima are close in value to the global minimum, and that saddle points pose the more significant obstacle to optimization.
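For readers who want to see the update rule these sources analyze, here is a minimal NumPy sketch of full-batch gradient descent on a least-squares objective; the function and variable names are illustrative, not drawn from the references above:

```python
import numpy as np

def batch_gradient_descent(X, y, lr=0.1, steps=100):
    """Full-batch gradient descent for least squares: min_w ||Xw - y||^2 / (2N)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # The gradient is computed over the ENTIRE dataset at every step --
        # the per-step compute and memory cost discussed in the references.
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w
```

Because each step touches all N examples, the cost per update grows linearly with dataset size, which is why the sources above contrast this scheme with mini-batch and stochastic variants.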