Distributed Dense Retrieval: Implementations and Optimizations
Milvus: A Purpose-Built Vector Data Management System, Xiangyu Wang, Xiaofan Luan, Cong Fu, Hao Xu, Xiaomeng Zhang, Guotao Cheng, Mengya Yuan, Shaohua Wang, Shengjun Gong, Jianling Ding, Junbo Yang, Bo Yang, Renjie Huang, Jinrui Cao, Yigong Wang, Jianying Su, Fan Jia, Wei Li, Xiang Li, Xiaoyu Wang, 2021. Proceedings of the VLDB Endowment (PVLDB), Vol. 14 (VLDB Endowment). DOI: 10.14778/3476310.3476340 - Presents the architecture and design of Milvus, an open-source, cloud-native vector database system designed for scalable, distributed similarity search.
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf, 2019. Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS 2019. DOI: 10.48550/arXiv.1910.01108 - Describes a method for distilling large transformer models into smaller, more efficient versions, a key optimization strategy for reducing latency and cost in distributed dense retrieval systems.