Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Chelsea Finn, Pieter Abbeel, and Sergey Levine, 2017. Proceedings of the 34th International Conference on Machine Learning (ICML), PMLR (Proceedings of Machine Learning Research), Vol. 70. DOI: 10.5555/3305890.3306019 - Introduces Model-Agnostic Meta-Learning (MAML), a foundational gradient-based meta-learning algorithm whose computational and memory demands for second-order derivatives are a central challenge for scaling to foundation models.
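For orientation, a minimal PyTorch sketch of the MAML inner/outer computation, assuming a small model, a standard loss function, and pre-batched support/query tensors (all placeholders, not the paper's code). The create_graph=True call is what keeps the inner-step graph alive and incurs the second-order memory cost noted above.

```python
import torch

def maml_query_loss(model, loss_fn, support, query, inner_lr=0.01):
    """Query-set loss after one inner gradient step on the support set (one task)."""
    names, params = zip(*model.named_parameters())
    x_s, y_s = support
    inner_loss = loss_fn(model(x_s), y_s)
    # create_graph=True retains the inner-step graph so that backprop through the
    # query loss produces the second-order terms that dominate MAML's memory cost.
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    adapted = {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}
    x_q, y_q = query
    # Forward pass with the adapted parameters (torch.func.functional_call, PyTorch >= 2.0).
    out = torch.func.functional_call(model, adapted, (x_q,))
    return loss_fn(out, y_q)  # .backward() on this flows into the original parameters
```

In a meta-training loop one would average this loss over a batch of tasks and step a meta-optimizer on the original parameters.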
On First-Order Meta-Learning Algorithms, Alex Nichol, Joshua Achiam, and John Schulman, 2018. arXiv preprint arXiv:1803.02999 - Presents Reptile, a first-order meta-learning algorithm that avoids second-order derivatives and thus offers a more computationally efficient alternative to MAML, making it more practical for large-scale applications and directly relevant to scalability.
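A minimal sketch of a Reptile-style outer update, assuming a PyTorch model and a task_batches iterator that yields batches from one sampled task (both assumptions, not the paper's code). It illustrates why the method is first-order: only plain SGD gradients are computed, and the meta-update interpolates toward the adapted weights.

```python
import copy
import torch

def reptile_step(model, loss_fn, task_batches, inner_lr=0.01, meta_lr=0.1, inner_steps=5):
    """Adapt a copy of the model on one task, then move the meta-parameters
    a fraction of the way toward the adapted weights (no second-order terms)."""
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        x, y = next(task_batches)          # batches from a single sampled task
        opt.zero_grad()
        loss_fn(adapted(x), y).backward()  # plain first-order gradients only
        opt.step()
    # Reptile meta-update: theta <- theta + meta_lr * (phi - theta)
    with torch.no_grad():
        for p, p_adapted in zip(model.parameters(), adapted.parameters()):
            p.add_(meta_lr * (p_adapted - p))
```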
LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen, 2021. International Conference on Learning Representations (ICLR 2022), OpenReview.net. DOI: 10.48550/arXiv.2106.09685 - Introduces Low-Rank Adaptation (LoRA), a prominent parameter-efficient fine-tuning technique that reduces the number of trainable parameters, which is essential for parameter-efficient meta-learning on foundation models.
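An illustrative LoRA-style linear layer following the low-rank parameterization the paper describes (frozen base weight plus a trainable update scaled by alpha/r); the initialization constants and class name here are assumptions, not the reference implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative sketch)."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        self.base.bias.requires_grad_(False)
        # Only A and B are trained: r * (in_features + out_features) parameters.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())
```

With r much smaller than the layer dimensions, the trainable parameter count drops by orders of magnitude relative to fine-tuning the full weight matrix.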
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models, Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, and Yuxiong He, 2020. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), IEEE. DOI: 10.1109/SC41405.2020.00078 - Details ZeRO, a set of memory optimizations for large-scale distributed training that partitions optimizer states, gradients, and parameters across devices, directly addressing the memory footprint and computational demands of foundation models and therefore crucial for scalable meta-learning.
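A rough back-of-envelope calculation in the spirit of the ZeRO paper's memory analysis, assuming mixed-precision Adam with about 16 bytes of state per parameter (2 bytes fp16 weights, 2 bytes fp16 gradients, 12 bytes fp32 optimizer state) and full partitioning of all three across devices; the helper function and numbers are illustrative, not the paper's exact accounting.

```python
def per_gpu_memory_gb(num_params, num_gpus, partitioned=True):
    """Approximate per-GPU training-state memory under mixed-precision Adam."""
    fp16_params, fp16_grads, opt_states = 2, 2, 12   # bytes per parameter (assumed)
    total = (fp16_params + fp16_grads + opt_states) * num_params
    if partitioned:
        total /= num_gpus  # ZeRO-style partitioning of params, grads, and optimizer states
    return total / 1e9

print(per_gpu_memory_gb(7e9, 64, partitioned=False))  # ~112 GB: exceeds a single GPU
print(per_gpu_memory_gb(7e9, 64, partitioned=True))   # ~1.75 GB per GPU after partitioning
```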