Overcoming catastrophic forgetting in neural networks, James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell, 2017. Proceedings of the National Academy of Sciences, Vol. 114 (National Academy of Sciences). DOI: 10.1073/pnas.1611835114 - Introduces Elastic Weight Consolidation (EWC), a regularization method that mitigates catastrophic forgetting by identifying parameters important for previous tasks and penalizing changes to them.
A Comprehensive Survey of Continual Learning: Theory, Method and Application, Liyuan Wang, Xingxing Zhang, Hang Su, Jun Zhu, 2023. arXiv preprint arXiv:2302.00487. DOI: 10.48550/arXiv.2302.00487 - Offers a broad overview of continual learning, covering diverse strategies, challenges, and applications across various domains, including discussions relevant to sequential adaptation of LLMs.
LoRA: Low-Rank Adaptation of Large Language Models, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, 2021. arXiv preprint arXiv:2106.09685. DOI: 10.48550/arXiv.2106.09685 - Introduces Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning technique highly relevant for adapting LLMs in sequential learning scenarios by adding small, task-specific modules.