Release It! Design and Deploy Production-Ready Software, Michael T. Nygard, 2018 (The Pragmatic Programmers) - Focuses on resilience patterns, including circuit breakers, timeouts, and retries, offering practical strategies for building fault-tolerant systems and handling failures gracefully in complex production environments.
AWS Well-Architected Framework: Reliability Pillar, Amazon Web Services, 2024 (Amazon Web Services) - An official guide outlining best practices for designing and operating reliable workloads in the cloud, covering multi-AZ/multi-region deployments, disaster recovery, and operational excellence for high availability.
A Survey on Vector Database Management Systems, Xuanhe Zhou, Haopeng Wang, Yonggang Wen, Hanlin Zhang, Yuwei Wu, Jiaheng Lu, Zhiqiang Xu, 2023 (arXiv preprint), DOI: 10.48550/arXiv.2307.03118 - Provides a comprehensive overview of the architectures, features, and challenges of various vector database management systems, including discussions on data replication and scalability strategies essential for their high availability.