Gradient-based learning applied to document recognition, Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner, 1998, Proceedings of the IEEE, Vol. 86 (IEEE), DOI: 10.1109/5.726791 - Introduces the foundational concepts of Convolutional Neural Networks (CNNs), including local receptive fields, shared weights, and pooling, which are essential for Convolutional Autoencoders.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - A comprehensive textbook providing in-depth theoretical foundations for both Convolutional Neural Networks and Autoencoders, covering architectures, training, and various forms of each.
Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction, Jonathan Masci, Ueli Meier, Dan Cireşan, Jürgen Schmidhuber, 2011, International Conference on Artificial Neural Networks (ICANN), Vol. 6791 (Springer, Berlin, Heidelberg), DOI: 10.1007/978-3-642-21735-7_7 - A seminal paper that explicitly introduces the architecture and application of Convolutional Autoencoders for learning hierarchical features from image data.
CS231n: Convolutional Neural Networks for Visual Recognition, Fei-Fei Li, Ehsan Adeli, Justin Johnson, Zane Durante, 2025 (Stanford University) - Provides detailed explanations and visualizations of Convolutional Neural Networks, pooling, strided convolutions, and transposed convolutions, which are fundamental components of CAEs.