Overview
Explore how DevOps and Site Reliability Engineering (SRE) are converging to build more resilient systems in this 25-minute conference talk by Nagarjuna Malladi at Conf42 DevOps 2025. Learn about synergetic SRE practices, cloud native infrastructure, and key SRE strategies for improving operational reliability. Examine the metrics used to measure system health, automation techniques for reducing toil, and the ways AI and machine learning are being applied in SRE. The talk also covers SRE best practices, industry-specific considerations, and future trends, and closes with key takeaways for both newcomers and experienced DevOps and SRE practitioners.
Syllabus
00:00 Introduction and Welcome
00:43 DevOps and SRE Convergence
02:20 Synergetic Practices in SRE
03:56 Cloud Native Infrastructure
06:14 Key SRE Strategies
09:59 Metrics That Matter
12:17 Automation in SRE
14:48 AI and Machine Learning in SRE
18:13 SRE Best Practices
21:05 Industry-Specific Considerations
22:11 Future Trends in SRE
23:45 Key Takeaways and Conclusion
Taught by
Conf42