Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Chelsea Finn, Pieter Abbeel, and Sergey Levine, 2017. Proceedings of the 34th International Conference on Machine Learning, Vol. 70 (PMLR). DOI: 10.48550/arXiv.1703.03400 - The foundational paper introducing MAML, a core meta-learning algorithm; useful for understanding both adaptation performance and the computational cost of meta-training.
Language Models are Few-Shot Learners, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeff Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei, 2020. Advances in Neural Information Processing Systems, Vol. 33 (Curran Associates, Inc.). DOI: 10.48550/arXiv.2005.14165 - Presents GPT-3, a large foundation model, and evaluates its few-shot learning abilities at scale; provides context for benchmarking foundation models.
LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen, 2022. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.2106.09685 - Introduces LoRA, a parameter-efficient fine-tuning (PEFT) method that serves as a key baseline for efficiency comparisons in scalable implementations.
Measuring and Improving Reproducibility in Deep Reinforcement Learning, Lena von Kügelgen, David W. Zhang, Andrew L. Maas, Daniel S. Takeshita, Moritz Knoth, Alex Hernandez-Garcia, Stefan Preuss, and Alexia J. Zou, 2020. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1909.06830 - Discusses reproducibility and robust experimental design in deep learning, offering guidance for setting up reliable benchmarking experiments.