Speech Recognition with Weighted Finite-State Transducers, Mehryar Mohri, Fernando Pereira, Michael Riley, 2002. Proceedings of the 2002 IEEE Workshop on Machine Learning for Signal Processing (IEEE). DOI: 10.1109/MLSP.2002.1026040 - Foundational paper detailing the use of Weighted Finite-State Transducers for building and decoding speech recognition systems.
Sequence Transduction with Recurrent Neural Networks, Alex Graves, 2012. Proceedings of the International Conference on Machine Learning (ICML) 2012 Workshop on Representation Learning, Vol. 27 - Presents the Recurrent Neural Network Transducer (RNN-T) architecture, a framework for mapping input sequences directly to output sequences without explicit alignments, which is crucial for streaming ASR.