Temporal Difference Learning Methods

References

Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - A widely recognized textbook for reinforcement learning, offering extensive details on TD(0), SARSA, and Q-Learning.
Learning to Predict by Methods of Temporal Differences, Richard S. Sutton, 1988 Machine Learning, Vol. 3 (Springer) DOI: 10.1007/BF00115009 - The original paper introducing temporal difference learning, laying the groundwork for subsequent TD methods.
On-Line Q-Learning Using Connectionist Systems, Gavin Adrian Rummery, Mahesan Niranjan, 1994 (Department of Engineering, University of Cambridge) - A technical report that first described the on-policy temporal difference control algorithm later named SARSA.