Long Short-Term Memory, Sepp Hochreiter and Jürgen Schmidhuber, 1997, Neural Computation, Vol. 9 (MIT Press), DOI: 10.1162/neco.1997.9.8.1735 - Introduces the foundational Long Short-Term Memory (LSTM) recurrent neural network architecture, detailing its internal mechanisms, including gates and cell states.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Offers extensive theoretical coverage of recurrent neural networks, LSTMs, and GRUs, including their architectures, training, and parameter considerations.
tf.keras.layers.LSTM and tf.keras.layers.GRU, TensorFlow Developers, 2023 - Official documentation describing the parameters and use of LSTM and GRU layers in TensorFlow/Keras, directly supporting the code examples.
torch.nn.LSTM and torch.nn.GRU, PyTorch Core Team, 2023 (PyTorch Foundation) - Official documentation for PyTorch's LSTM and GRU module implementations, explaining their constructors, parameters, and return values.