A Tutorial on Thompson Sampling, Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen, 2018. Foundations and Trends in Machine Learning, Vol. 11 (Now Publishers). DOI: 10.1561/2200000070 - A comprehensive tutorial covering the foundations, theory, and applications of Thompson Sampling in various contexts, including reinforcement learning.
Deep Exploration via Bootstrapped DQN, Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy, 2016. Advances in Neural Information Processing Systems, Vol. 29 (NeurIPS) - Introduces Bootstrapped DQN, a practical and widely used approximation of Thompson Sampling for deep reinforcement learning.