BLEU: a Method for Automatic Evaluation of Machine Translation. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu, 2002. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Association for Computational Linguistics). DOI: 10.3115/1073083.1073135 - Introduces the BLEU score, a foundational metric for automatic evaluation of machine translation based on n-gram overlap between a candidate translation and one or more references (a toy sketch of this overlap computation follows these entries).
ROUGE: A Package for Automatic Evaluation of Summaries. Chin-Yew Lin, 2004. Text Summarization Branches Out (Association for Computational Linguistics). DOI: 10.3115/1621876.1621880 - Presents the ROUGE metric suite, widely used for evaluating the quality of summaries and other generated text by measuring n-gram and subsequence overlap against reference texts.
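Both metrics rest on the same basic idea of counting overlapping n-grams between a system output and human references. The sketch below is a minimal, illustrative Python rendering of that idea, not a full implementation of either metric: the function names and toy sentences are invented for illustration, full BLEU additionally combines clipped precisions over n = 1..4 with a brevity penalty, and ROUGE includes further variants such as ROUGE-L over longest common subsequences.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: the per-n quantity BLEU averages over n = 1..4."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / max(sum(cand.values()), 1)

def rouge_n_recall(candidate, reference, n):
    """ROUGE-N style recall: fraction of reference n-grams recovered by the candidate."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / max(sum(ref.values()), 1)

# Toy example (invented sentences, single reference for simplicity).
cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(modified_precision(cand, ref, 1))  # unigram precision: 5/6
print(rouge_n_recall(cand, ref, 2))      # bigram recall: 3/5
```

In practice, established implementations (for example the sacrebleu or rouge-score packages) are preferable to hand-rolled counts, since tokenization and corpus-level aggregation choices materially affect the reported scores.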