Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, Kyunghyun Cho, Bart van Merriënboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio, 2014 (Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP), DOI: 10.48550/arXiv.1406.1078 - Introduces the Gated Recurrent Unit (GRU) architecture, detailing its structure and function as a more computationally efficient alternative to LSTMs for sequence modeling tasks.
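As context for this entry, the GRU's gating equations can be sketched in plain Python. This is a toy scalar version with hand-picked weights and no biases, not the paper's vectorized formulation; note also that the interpolation convention for the update gate (whether z weights the old or the new state) differs between the original paper and later framework implementations.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_cell(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step for a scalar input and hidden state (biases omitted).

    z       = sigmoid(Wz*x + Uz*h_prev)          # update gate
    r       = sigmoid(Wr*x + Ur*h_prev)          # reset gate
    h_tilde = tanh(Wh*x + Uh*(r*h_prev))         # candidate state
    h       = (1 - z)*h_prev + z*h_tilde         # gated interpolation
    """
    z = sigmoid(Wz * x + Uz * h_prev)
    r = sigmoid(Wr * x + Ur * h_prev)
    h_tilde = math.tanh(Wh * x + Uh * (r * h_prev))
    return (1.0 - z) * h_prev + z * h_tilde


# Run a short sequence through the cell with fixed toy weights.
h = 0.0
for x in [1.0, -0.5, 0.25]:
    h = gru_cell(x, h, Wz=0.5, Uz=0.1, Wr=0.3, Ur=0.2, Wh=0.8, Uh=0.4)
```

Because the candidate state passes through tanh and the new state is a convex combination of the old state and that candidate, the hidden state stays bounded in (-1, 1) when initialized there, which is part of why GRUs train more stably than a vanilla RNN.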
tf.keras.layers.GRU, TensorFlow Developers, 2023 (TensorFlow) - Official documentation for Keras GRU layers in TensorFlow, providing API details, parameters, and examples for implementation.
torch.nn.GRU, PyTorch Developers, 2023 (PyTorch) - Official documentation for the PyTorch GRU module, outlining its parameters, inputs, outputs, and usage for sequence modeling.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A foundational textbook covering recurrent neural networks, LSTMs, and GRUs, including their theoretical underpinnings and comparisons.