Handling Highly Dynamic and Streaming Data Sources
Designing Data-Intensive Applications, Martin Kleppmann, 2017 (O'Reilly Media) - A foundational book on distributed systems that covers principles relevant to high-velocity data ingestion, stream processing, and various database architectures including those using Log-Structured Merge (LSM) trees, providing a comprehensive understanding of building scalable and resilient data systems.
The Log-Structured Merge-Tree (LSM-Tree), Patrick O'Neil, Edward Cheng, Dieter Gawlick, Elizabeth O'Neil, 1996, Acta Informatica, Vol. 33 (Springer), DOI: 10.1007/s002360050048 - The original academic paper introducing the Log-Structured Merge-Tree (LSM-Tree) data structure, which is fundamental to many modern NoSQL databases and increasingly vector databases for efficient high-throughput writes and incremental updates.
Debezium Documentation, The Debezium Community, 2023 - The official documentation for Debezium, an open-source platform for Change Data Capture (CDC), detailing its architecture, connectors, and usage for streaming database changes, which is crucial for synchronizing RAG systems with frequently updated operational data stores.