While selecting appropriate initialization strategies and learning rate schedules sets a solid foundation for training, the performance of deep learning models often hinges on finding the right values for several critical hyperparameters. These are settings configured before the training process begins, unlike model parameters (weights and biases), which are learned during training. Tuning these hyperparameters is a fundamental part of the deep learning workflow, often requiring systematic exploration and experimentation.

In this section, we focus on three particularly influential hyperparameters: the learning rate ($ \alpha $), the regularization strength ($ \lambda $), and the mini-batch size. Understanding how to adjust these can markedly impact your model's convergence speed, final performance, and ability to generalize to new data.

## What are Hyperparameters?

Recall that model parameters are the weights and biases within the network that the optimization algorithm adjusts during training to minimize the loss function. Hyperparameters, on the other hand, are external configurations that define the model's structure or the training process itself. Examples include:

- The learning rate ($ \alpha $) for the optimizer.
- The regularization strength ($ \lambda $) for L1 or L2 regularization.
- The dropout rate.
- The number of layers in the network.
- The number of units per layer.
- The choice of activation function.
- The choice of optimizer (SGD, Adam, etc.).
- The mini-batch size.
- Parameters for learning rate schedules (e.g., decay rate).

Finding a good combination of hyperparameters is often more art than science, guided by experience, intuition, and iterative experimentation.

## Tuning the Learning Rate ($ \alpha $)

The learning rate is arguably the most important hyperparameter to tune. As discussed in previous chapters, it controls the step size taken during gradient descent.

- **Too small $ \alpha $:** Training progresses very slowly, potentially getting stuck in poor local minima or taking an impractically long time to converge.
- **Too large $ \alpha $:** Training can become unstable. The loss might oscillate wildly or even diverge (increase indefinitely) because the steps overshoot the minimum.

Finding an effective learning rate often involves searching within a logarithmic range, such as $10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}, 10^{-5}$. A common starting point for Adam is around $10^{-3}$ or $10^{-4}$, while SGD with momentum might start around $10^{-2}$. However, these are just heuristics, and the optimal value depends heavily on the dataset, model architecture, optimizer choice, and even the batch size.

Learning rate schedules, covered previously, help by adjusting $ \alpha $ during training, but the initial learning rate and the parameters of the schedule itself (e.g., decay rate, step size) still need careful selection. Monitoring the training loss curve is essential: a rapidly decreasing but stable loss suggests a good learning rate, while oscillations or divergence indicate it is likely too high.
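In practice, a quick way to locate a workable range is a coarse sweep over a logarithmic grid, keeping each candidate's training run short. The sketch below shows one minimal way to do this, assuming a PyTorch setup; the synthetic dataset, model size, and epoch budget are hypothetical placeholders for illustration.

```python
import torch
import torch.nn as nn

# Synthetic regression data as a stand-in for a real dataset (hypothetical).
torch.manual_seed(0)
X = torch.randn(512, 20)
y = X @ torch.randn(20, 1) + 0.1 * torch.randn(512, 1)

def train_with_lr(lr, epochs=30):
    """Train a small MLP briefly at the given learning rate; return final loss."""
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
    return loss.item()

# Coarse sweep over a logarithmic grid of candidate learning rates.
for lr in [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]:
    print(f"lr={lr:.0e}  final training loss: {train_with_lr(lr):.4f}")
```

The rate whose loss falls fastest without oscillating or diverging is a sensible center for a finer follow-up search.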
*Figure: Effect of learning rate on training loss (training loss vs. epoch, log scale). A well-chosen rate (e.g., 0.001) shows a steady decrease, too small a rate (e.g., 0.00001) converges slowly, and too large a rate (e.g., 0.1) causes instability or divergence.*

## Tuning Regularization Strength ($ \lambda $)

Regularization techniques like L1 and L2 (weight decay), introduced in Chapter 2, add a penalty term to the loss function based on the magnitude of the model weights. The regularization strength, often denoted by $ \lambda $ (lambda), controls the weight of this penalty.

$$ \text{Total Loss} = \text{Original Loss (e.g., Cross-Entropy)} + \lambda \times \text{Regularization Term} $$

- **Too small $ \lambda $:** The regularization effect is minimal, and the model may still overfit significantly. This is equivalent to having almost no regularization.
- **Too large $ \lambda $:** The penalty on weights dominates the loss function. The optimizer focuses too much on shrinking weights towards zero, potentially neglecting the task of fitting the data and leading to underfitting (high bias).

Similar to the learning rate, $ \lambda $ is often tuned on a logarithmic scale, exploring values like $0.1, 0.01, 0.001, 0.0001, 0$. The optimal value depends on the degree of overfitting observed without regularization. If the model overfits heavily (a large gap between training and validation loss/accuracy), a larger $ \lambda $ might be needed. If the model underfits, $ \lambda $ should be reduced or set to zero. Remember that other regularization techniques like Dropout and Batch Normalization also influence the optimal $ \lambda $.
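The sketch below shows one minimal way to run such a sweep, again assuming PyTorch, where passing `weight_decay` to the optimizer applies an L2 penalty on the weights; the synthetic dataset, train/validation split, and model are hypothetical placeholders.

```python
import torch
import torch.nn as nn

# Synthetic data with a train/validation split (hypothetical stand-in).
torch.manual_seed(0)
X = torch.randn(400, 20)
y = X @ torch.randn(20, 1) + 0.5 * torch.randn(400, 1)
X_train, y_train = X[:300], y[:300]
X_val, y_val = X[300:], y[300:]

def losses_for_lambda(lam, epochs=200):
    """Train with L2 strength lam; return (train loss, validation loss)."""
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
    # In PyTorch, the optimizer's weight_decay argument is the L2 strength.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=lam)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss_fn(model(X_train), y_train).backward()
        optimizer.step()
    with torch.no_grad():
        return (loss_fn(model(X_train), y_train).item(),
                loss_fn(model(X_val), y_val).item())

# Sweep lambda on a logarithmic scale and watch the train/validation gap.
for lam in [0.0, 1e-4, 1e-3, 1e-2, 1e-1]:
    train_loss, val_loss = losses_for_lambda(lam)
    print(f"lambda={lam:.0e}  train={train_loss:.4f}  val={val_loss:.4f}")
```

A reasonable choice is the $ \lambda $ that minimizes validation loss: smaller values leave a large train/validation gap, while larger values push both losses up as the model underfits.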
## Tuning the Batch Size

The batch size determines how many training examples are processed before the model's weights are updated. It impacts both training dynamics and computational resource usage.

**Small batch size (e.g., 1, 8, 16, 32):**

- Introduces more noise into the gradient estimates. This noise can sometimes help the optimizer escape sharp local minima and potentially lead to better generalization (acting as a form of regularization).
- Requires less memory per batch.
- Updates are frequent, which might speed up initial convergence, but overall training can be slow because many small, less parallelizable updates under-utilize hardware such as GPUs.

**Large batch size (e.g., 128, 256, 512, 1024+):**

- Provides more accurate gradient estimates, leading to smoother convergence.
- Can leverage hardware parallelism more effectively, potentially speeding up training time per epoch if memory allows.
- May converge to sharper minima, which sometimes generalize less well than the flatter minima found with smaller batches.
- Requires significantly more memory.

The choice of batch size is often constrained by GPU memory. Common practice is to start with a standard size like 32, 64, or 128 and adjust based on performance and memory constraints. Powers of 2 are often chosen for batch sizes due to hardware memory alignment efficiencies, but this is not a strict requirement. Batch size also interacts with the learning rate, a relationship we explore in the next section.

Finding the right combination of these hyperparameters is important for maximizing model performance. The next sections discuss strategies like grid search and random search to navigate this tuning process more systematically.
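Before moving on to those search strategies, a small timing sketch can make the batch size trade-off concrete. It again assumes PyTorch; the synthetic dataset and model are hypothetical, and on a CPU the timing differences will be far more muted than on a GPU.

```python
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic classification data as a stand-in for a real dataset (hypothetical).
torch.manual_seed(0)
X = torch.randn(4096, 32)
y = torch.randint(0, 10, (4096,))
dataset = TensorDataset(X, y)

def epoch_stats(batch_size):
    """Run one training epoch; return the number of updates and wall time."""
    model = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    start = time.perf_counter()
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
    return len(loader), time.perf_counter() - start

# Larger batches mean fewer (but heavier) weight updates per epoch.
for bs in [16, 64, 256, 1024]:
    updates, seconds = epoch_stats(bs)
    print(f"batch_size={bs:4d}  updates/epoch={updates:3d}  epoch time={seconds:.2f}s")
```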