Long Short-Term Memory, Sepp Hochreiter, Jürgen Schmidhuber, 1997, Neural Computation, Vol. 9 (MIT Press), DOI: 10.1162/neco.1997.9.8.1735 - Introduces the Long Short-Term Memory (LSTM) architecture, including the foundational concept of gates for managing memory.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Provides a comprehensive explanation of Recurrent Neural Networks and LSTMs, detailing the function and operation of gates.
Recurrent Neural Networks and LSTMs (CS224n Lecture Notes), Abigail See, Christopher Manning, 2021 (Stanford University) - Course lecture notes explaining the architecture and operation of LSTMs, including the forget gate.