Course Overview
The SRE Foundation course introduces a range of practices for improving service reliability through a mixture of automation, working methods, and organizational realignment. It is tailored for those focused on large-scale service availability. Site Reliability Engineering is a term that is quickly growing in prominence, mainly because it is the operating model Google uses to run its production services. Google formed its first SRE teams around 2003, and the discipline gained wide attention after Google published its Site Reliability Engineering book in 2016.
The SRE Foundation (Site Reliability Engineering) course is an introduction to the principles and practices that enable an organization to reliably and economically scale critical services. Introducing a site-reliability dimension requires organizational re-alignment, a new focus on engineering and automation, and the adoption of a range of new working paradigms.
The course highlights the evolution of Site Reliability Engineering and its future direction. It equips participants with the practices, methods, and tools to engage people across the organization who are involved in reliability and stability, illustrated through real-life scenarios and case stories. Upon completion of the course, participants will have tangible takeaways to apply back in the office, such as understanding, setting, and tracking Service Level Objectives (SLOs).
The course was developed by drawing on key SRE sources, engaging with thought leaders in the SRE space, and working with organizations embracing SRE to extract real-life best practices. It is designed to teach the key principles and practices necessary for starting SRE adoption. Site Reliability Engineering embraces most of the fundamental concepts of DevOps, such as automation, collaboration, and quality, but augments them with more measurable targets, which are especially relevant in an enterprise context.
This course positions learners to successfully complete the SRE Foundation certification exam.
Learning Objectives
The learning objectives for the SRE Foundation course include a practical understanding of:
- The history of SRE and its emergence at Google
- The inter-relationship of SRE with DevOps and other popular frameworks
- The underlying principles behind SRE
- Service Level Objectives (SLOs) and their user focus
- Service Level Indicators (SLIs) and the modern monitoring landscape
- Error budgets and the associated error budget policies
- Toil and its effect on an organization’s productivity
- Some practical steps that can help to eliminate toil
- Observability as a means of indicating the health of a service
- SRE tools, automation techniques and the importance of security
- Anti-fragility, our approach to failure and failure testing
- The organizational impact that introducing SRE brings
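To give a flavor of the SLO and error budget objectives listed above, here is a minimal Python sketch of the underlying arithmetic. It is an illustration only, not course material: the function names, the 30-day window, and the 99.9% target are assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of unavailability an availability SLO allows over a window.

    slo_target is a fraction, e.g. 0.999 for "99.9% available" (hypothetical value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent, using a request-based SLI.

    A negative result means the budget has been overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= good_events <= total_events:
        raise ValueError("need 0 <= good_events <= total_events and total_events > 0")
    allowed_bad = (1.0 - slo_target) * total_events  # failures the SLO tolerates
    actual_bad = total_events - good_events          # failures actually observed
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # A 99.9% SLO over 30 days leaves roughly 43.2 minutes of budget.
    print(round(error_budget_minutes(0.999), 1))
    # 500 failed requests out of 1,000,000 spends about half of that budget.
    print(round(budget_remaining(0.999, 999_500, 1_000_000), 2))
```

An error budget policy would then define what happens as the remaining fraction approaches zero, for example pausing feature releases in favor of reliability work.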
Target Audience
The Site Reliability Engineering (SRE) Foundation course is designed for professionals aiming to enhance their understanding of SRE principles and practices. This course is ideal for:
- Site Reliability Engineers: Individuals responsible for maintaining system reliability and performance.
- DevOps Practitioners: Professionals integrating development and operations to improve collaboration and efficiency.
- System Administrators: Experts managing and configuring computer systems and networks.
- IT Operations Staff: Teams overseeing the day-to-day management of IT infrastructure.
- Software Engineers: Developers focused on building and maintaining software applications.
- IT Managers: Leaders overseeing IT teams and aligning IT strategies with business objectives.
- Anyone Interested in SRE: Individuals seeking to understand and implement SRE practices within their organizations.
This course provides foundational knowledge for those looking to adopt or improve SRE practices, ensuring systems are scalable, reliable, and efficient.
Exam Structure
The Site Reliability Engineering (SRE) Foundation certification exam is structured to assess your understanding of SRE principles and practices. The exam details are as follows:
- Format: Multiple-choice questions
- Number of Questions: 40
- Duration: 60 minutes
- Passing Score: 65% (26 of 40 questions)
- Open Book: Yes
- Certification Validity: 3 years

Vengates Rao Subramaniam –
I like the most about the thought process of how to incorporate SRE with DevOps and Agile practice.
Junnes Pineda –
John is a great instructor. He is trying to give real life examples based on his personal professional experiences. Encourage interactive engagements among the participants like mural boards, breakout sessions to further discuss on how the concepts can be applied to our work/ team/ org.
Santosh Gadhave –
It was practical and both trainers (Jan & Lavanya) had a very good understanding of concepts. If I could change one thing about this course, it would be adding more practical examples from different case studies.
Santosh Gadhave –
Trainer Lavanya is very well read and her experience talks a lot.
Jason Ho –
Duration is nicely spaced. Maybe provide some practical use cases from other large organisations implementing SRE.