Gradient-Based Learning Applied to Document Recognition, Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner, 1998, Proceedings of the IEEE, Vol. 86 (IEEE), DOI: 10.1109/5.726791 - Presents one of the first successful CNN architectures, LeNet-5, demonstrating convolutional and pooling layers for optical character recognition.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A standard textbook offering extensive coverage of convolutional networks, from their theoretical underpinnings to architectural designs.
Convolutional Neural Networks for Visual Recognition (CS231n) Lecture Notes, Fei-Fei Li, Justin Johnson, and Serena Yeung, 2017 (Stanford University) - Detailed lecture notes on convolutional neural networks, covering their architecture, common layers, and practical considerations.
Deep Sparse Rectifier Neural Networks, Xavier Glorot, Antoine Bordes, and Yoshua Bengio, 2011, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 15 (PMLR) - Introduces and analyzes the use of Rectified Linear Units (ReLUs) as activation functions, highlighting their advantages in training deep networks.
torch.nn.Sequential, PyTorch Developers, 2024 (PyTorch Foundation) - Official documentation for PyTorch's nn.Sequential module and related layers (e.g., Conv2d, MaxPool2d, Flatten, Linear), which are used for building CNNs.