Site Reliability Engineering: How Google Runs Production Systems, Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy, 2016 (O'Reilly Media) - Presents the core SRE principles and practices for managing production systems, including incident response and postmortems.
The Site Reliability Workbook: Practical Ways to Implement SRE, Betsy Beyer, Niall Richard Murphy, David K. Rensin, Kent Kawahara, Stephen Thorne, 2018 (O'Reilly Media) - Provides practical advice and examples for implementing SRE, covering incident management, on-call, and operational documentation.