WaveNet: A Generative Model for Raw Audio, Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu, 2016. arXiv preprint arXiv:1609.03499. DOI: 10.48550/arXiv.1609.03499 - Introduces a foundational neural vocoder and discusses the necessity of perceptual evaluation for high-quality synthetic speech.