Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Daniel Jurafsky and James H. Martin, 2025 (Pearson) - This comprehensive textbook covers the principles and practices of speech and language processing, detailing ASR system components, feature extraction, acoustic and language modeling, and decoding algorithms. It also includes modern neural network techniques.
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury, 2012, IEEE Signal Processing Magazine, Vol. 29 (IEEE), DOI: 10.1109/MSP.2012.2205597 - This overview paper presents the shared views of four research groups on deep neural networks for acoustic modeling, reporting significant improvements in speech recognition performance over Gaussian mixture models and influencing the design of modern ASR systems.
Speech Recognition with Deep Recurrent Neural Networks, Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE), DOI: 10.1109/ICASSP.2013.6638947 - Applies deep LSTM recurrent neural networks, trained end-to-end with Connectionist Temporal Classification (CTC) and the RNN transducer (RNN-T), to phoneme recognition on TIMIT, demonstrating an architecture that maps acoustic features directly to phoneme sequences without a pre-computed alignment.