Matching Networks for One Shot Learning, Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra, 2016. Advances in Neural Information Processing Systems, Vol. 29 (Curran Associates, Inc.). DOI: 10.5555/3157382.3157463 - Introduces Matching Networks, Full Contextual Embeddings (FCE), and their application to one-shot learning.
Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems (NeurIPS). DOI: 10.48550/arXiv.1706.03762 - Introduces the Transformer architecture and the scaled dot-product attention mechanism, relevant for advanced attention functions.