Site Reliability Engineering (SRE) Practitioner (DevOps Institute)

4.6 out of 5 rating. Last updated 14/11/2024. English.

Duration

3 Days

18 CPD hours

Overview

After completing this course, students will have learned:
A practical view of how to implement a flourishing SRE culture in your organization.
The underlying principles of SRE, what SRE is not (its anti-patterns), and how to recognize and avoid those anti-patterns.
The organizational impact of introducing SRE.
Mastering SLIs and SLOs in a distributed ecosystem, and extending the use of Error Budgets beyond the usual to innovate and avoid risks.
Building security and resilience by design in a distributed, zero-trust environment.
How to implement full-stack observability and distributed tracing, and bring about an observability-driven development culture.
Curating data using AI to move from reactive to proactive and predictive incident management, and using DataOps to build clean data lineage.
Why Platform Engineering is so important in building consistency and predictability into an SRE culture.
Implementing practical Chaos Engineering.
Major incident response responsibilities for an SRE based on the incident command framework, with examples of the anatomy of unmanaged incidents.
A perspective on why SRE can be considered the purest implementation of DevOps.
The SRE execution model.
Understanding the SRE role and why reliability is everyone’s problem.
Learnings from SRE success stories.

Description

This course introduces a range of practices for advancing service reliability engineering through a mixture of automation, organizational ways of working and business alignment. It is tailored for those focused on large-scale service scalability and reliability.

SRE Anti-patterns

Rebranding Ops or DevOps or Dev as SRE
Users notice an issue before you do
Measuring until my Edge
False positives are worse than no alerts
Configuration management trap for snowflakes
The Dogpile: Mob incident response
Point fixing
Production Readiness Gatekeeper
Fail-Safe really

SLO is a Proxy for Customer Happiness

Define SLIs that meaningfully measure the reliability of a service from a user's perspective
Define system boundaries in a distributed ecosystem to arrive at correct SLIs
Use error budgets to help your team have better discussions and make better data-driven decisions
Overall, Reliability is only as good as the weakest link on your service graph
Error thresholds when 3rd party services are used
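
As a rough illustration of the error budget and weakest-link topics listed above (this sketch is not taken from the course material), the arithmetic can be written out in a few lines of Python. The function names, the 30-day window and the example availability figures are all assumptions chosen for illustration, and the second function assumes dependencies fail independently and that every one of them is needed to serve a request.

```python
from math import prod
from typing import Iterable


def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of unavailability an availability SLO permits over the window.

    Hypothetical helper: the 30-day default window is an assumption.
    """
    return (1.0 - slo) * window_days * 24 * 60


def serial_availability(availabilities: Iterable[float]) -> float:
    """Availability of a request path that needs every dependency to succeed.

    Assumes independent failures, which real systems only approximate.
    """
    return prod(availabilities)


# A 99.9% SLO leaves roughly 43.2 minutes of error budget per 30 days.
budget = error_budget_minutes(0.999)

# Three illustrative dependencies in series: the combined availability
# ends up below that of the weakest one (0.995).
path = serial_availability([0.999, 0.9995, 0.995])

print(f"budget: {budget:.1f} min, path availability: {path:.4%}")
```

Under these assumptions the combined figure is always at or below the lowest individual availability, which is one way to read the "weakest link" item and the point about third-party services.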

Building Secure and Reliable Systems

SRE and their role in Building Secure and Reliable systems
Design for Changing Architecture
Fault tolerant Design
Design for Security
Design for Resiliency
Design for Scalability
Design for Performance
Design for Reliability
Ensuring Data Security and Privacy

Full-Stack Observability

Modern Apps are Complex & Unpredictable
Slow is the new down
Pillars of Observability
Implementing Synthetic and End user monitoring
Observability driven development
Distributed Tracing
What happens to Monitoring?
Instrumenting using Libraries and Agents

Platform Engineering and AIOps

Taking a Platform Centric View solves Organizational scalability challenges such as fragmentation, inconsistency and unpredictability.
How do you use AIOps to improve Resiliency?
How can DataOps help you in the journey?
A simple recipe to implement AIOps
Indicative measurement of AIOps

SRE & Incident Response Management

SRE Key Responsibilities towards incident response
DevOps & SRE and ITIL
OODA and SRE Incident Response
Closed Loop Remediation and the Advantages
Swarming Food for Thought
AI/ML for better incident management

Chaos Engineering

Navigating Complexity
Chaos Engineering Defined
Quick Facts about Chaos Engineering
Chaos Monkey Origin Story
Who is adopting Chaos Engineering?
Myths of Chaos
Chaos Engineering Experiments
GameDay Exercises
Security Chaos Engineering
Chaos Engineering Resources

SRE is the Purest form of DevOps

Key Principles of SRE
SREs help increase Reliability across the product spectrum
Metrics for Success
Selection of Target areas
SRE Execution Model
Culture and Behavioral Skills are key
SRE Case study

Post-class assignments/exercises

Non-abstract Large Scale Design (after Day 1)
Engineering Instrumentation: Instrumenting Gremlin (after Day 2)

Additional course details:

Nexus Humans Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward.

This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts.

Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're just starting to build professional skills or are already a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success.

While we feel this is the best Site Reliability Engineering (SRE) Practitioner (DevOps Institute) course and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you.

Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

FAQ for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) Course

Available Delivery Options for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training.

Live Instructor Led Classroom Online (Live Online)
Traditional Instructor Led Classroom (TILT/ILT)
Delivery at your offices in London or anywhere in the UK
Private dedicated course arranged to suit your staff

How many CPD hours does the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training provide?

The 3-day Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training course gives you up to 18 CPD hours/structured learning hours. If you need a letter or certificate in a particular format for your association, organisation or professional body, please just ask.

What is the correct audience for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training?

The target audience for the SRE Practitioner course is professionals including:
Anyone focused on large-scale service scalability and reliability
Anyone interested in modern IT leadership and organizational change approaches
Business Managers
Business Stakeholders
Change Agents
Consultants
DevOps Practitioners
IT Directors
IT Managers
IT Team Leaders
Product Owners
Scrum Masters
Software Engineers
Site Reliability Engineers
System Integrators
Tool Providers

Do you provide training for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute)?

Yes, we provide corporate training, dedicated training and closed classes for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute). This can take place anywhere in the UK, including England, Scotland, Cymru (Wales) or Northern Ireland, or live online, allowing teams from across the UK or further afield to attend a single training event and saving travel and delivery expenses.

What is the duration of the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) program?

The Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training takes place over 3 days, with each day lasting approximately 8 hours, including short breaks and a lunch break to ensure that the delegates get the most out of the day.

Why are Nexus Human the best provider for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute)?

Nexus Human are recognised as one of the best training companies, as they and their trainers have won and hold many awards and titles. These include the Small Firms Best Trainer award, national training partner of the year for the UK on multiple occasions, and trainers placed in the global top 30 instructor awards in 2012, 2019 and 2021. Nexus Human has also been nominated for the Tech Excellence awards multiple times and was a Learning Performance Institute (LPI) external training provider sponsor in 2024.

Is there a discount code for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training?

Yes, the discount code PENPAL5 is currently available for the Site Reliability Engineering (SRE) Practitioner (DevOps Institute) training. Other discount codes may also be available, but only one discount code or special offer can be used for each booking. This discount code is available for companies and individuals.

Training Insurance Included!

When you organise training, we understand that there is a risk that some people may fall ill or become unavailable. To mitigate this risk, we include training insurance for each delegate enrolled on our public schedule: if the need arises, they are welcome to sit the same public class within 6 months at no charge.

What people say about us

Booking process was very easy. Once initial contact was made everything happened smoothly.

Pfizer

Great course yesterday and can’t give enough amazing feedback for John. I’ve never had a teacher hold attention of a class from start to finish, but yesterday he did this.

Jennifer O'Connor

Vistra

Alan delivered the training in an easy to understand and pleasant manner which was broken down into manageable chunks with an appropriate number of breaks to give folk time away from their laptops and screens. It is never easy to do 2 days of training on a fairly in-depth subject involving Software even in a classroom environment and remaining focused and engaged so doing it remotely adds another level of complexity, but I have to say that personally this was one of the best Training Events that I have attended for some years. I have gained so much knowledge of MS Project and it will certainly make my job easier going forward.

A large part of what I gained has to be credited to Alan’s manner and delivery and I should be grateful if you could pass on my thanks to him.

Also, the support, together with joining instructions & training literature etc from Nexus Human has been excellent – thank you for that. I intend feeding that back to our People and ICT Teams too

Nexus Human - Project class

December 2023

This was my 2nd course with Nexus Human and I enjoy the format of the courses.
This was very well-run, the instructor was superb and I will get a lot of benefit from having access for the next 6 months. I look forward to the next course.

Cathal

Kerry, Ireland

I would like to thank the staff of Nexus Human, who were very helpful, prompt, friendly and professional in helping me attain my Microsoft certification. I would like to thank Abdul in particular for his guidance during the course and commend him on his knowledge of software development.

Paul

ITC Group
Dublin

This is the 3rd course I have done with Arun. He is very helpful and always supportive if I have other questions outside the lecture parameters. I have already recommended him and Nexus Human to work colleagues and would do so again after this course.

Philip

Nexus Human Classroom Student

The instructor was excellent. Was willing to field any questions we had and deal with our specific queries.

Excel student

The class just finished and I just wanted to mention to you that it was absolutely amazing. The quality of the training had very high standards, the trainer explained amazingly well and we actually practiced (did) all the steps. It was not learning by watching a presentation but by actually doing. The training has been delivered from Atlanta (US) and I cannot wait to go tomorrow in the office and recommend it. The connection and the lab environment were running perfectly.
Once again thank you very much and looking forward for the next trainings scheduled. Enrolling via Nexus Human was the perfect choice.

Cristina, Nexus Human IT Course Attendee

Firstly I would like to say thanks, I found the course I sat yesterday extremely beneficial and the instructor was fantastic, she was enthusiastic and engaging and I would highly recommend the course to friends and colleagues. The post class services are also excellent!

Adobe Student

Sharon was lovely and has a fantastic knowledge. Honestly I didn't think I needed part 1 but I am SO glad I came. Thank you. Will be back in for part 2.

Nexus Human Student

Excellent interaction and answered all of my questions. Looking forward to being able to apply some of the tips and features learned today which hopefully should have real impact on some of the more time consuming tasks currently encountered.

Nexus Human Power BI Attendee

Brilliant
Wonderful teacher
Course presented extremely well indeed

Nexus Human Student feedback
