Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - A standard textbook that explains L2 regularization's mathematical formulation and the 'weight decay' effect within the context of deep learning.
Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006 (Springer) - Offers a comprehensive statistical treatment of L2 regularization, covering its origins and general principles in machine learning.
Neural Networks and Deep Learning, Michael Nielsen, 2019 - A free online book providing a clear and accessible explanation of L2 regularization, its mathematical details, and the 'weight decay' mechanism.
CS231n: Convolutional Neural Networks for Visual Recognition - Regularization, Andrej Karpathy, Fei-Fei Li, Justin Johnson, Serena Yeung, et al., 2023 - Lecture notes from Stanford University's CS231n course, detailing the practical and mathematical aspects of L2 regularization and 'weight decay' in neural networks.