Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Provides a comprehensive explanation of activation functions, including the Sigmoid, its properties, historical context, and the problem of vanishing gradients in deep networks.
Lecture Notes on Neural Networks Part 1: Setting up the Architecture, Andrej Karpathy, Justin Johnson, and Serena Yeung, 2023 (Stanford University CS231n Course Notes) - Offers an accessible introduction to neural network architectures, covering activation functions like Sigmoid, their advantages, and critical drawbacks such as vanishing gradients and non-zero-centered outputs.
torch.sigmoid, PyTorch Developers, 2023 (PyTorch Foundation) - Official documentation for the Sigmoid activation function within the PyTorch deep learning framework, detailing its functional and module-based usage for practical implementation.
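The sigmoid function and the vanishing-gradient behavior discussed in these references can be sketched in plain Python (a minimal illustration for orientation, not code drawn from any of the cited sources):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Derivative sigma'(x) = sigma(x) * (1 - sigma(x)); its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

# For large |x| the gradient approaches 0 -- the saturation behavior
# behind the vanishing-gradient problem the references above describe.
print(sigmoid(0.0))        # 0.5
print(sigmoid_grad(0.0))   # 0.25
print(sigmoid_grad(10.0))  # roughly 4.5e-05
```

In PyTorch the same function is available as `torch.sigmoid(tensor)` or as the `torch.nn.Sigmoid` module, per the documentation cited above.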