SlideShare une entreprise Scribd logo
1  sur  36
Télécharger pour lire hors ligne
Shift-Left SRE - Self-healing on
OpenShift with Ansible
#RedHatOSD
Jürgen Etzlstorfer, Technology Strategist
@jetzlstorfer
A relentless pursuit of software perfection
1600 Employees 5000 Enterprise Customers 79 of the Global 100
confidential
If you write applications,
they will break eventually
~ Murphy‘s law
On average, a single transaction uses 82 different types of technology
Browser
Multi-geo
Mobile Network
Code
Hosts
Logs
IoT
3rd parties
Services
Cloud SDN
Containers
Applications are getting more complex!
confidential
What if you had
something similar to
a self-healing robot?
confidential
But how?
Building auto-remediation into your pipeline
#RedHatOSD
✓ ✓ ✓ ✓
commit build test stage prod
✓
Building auto-remediation into your pipeline
#RedHatOSD
✓ ✓ ✓ ✓ ✗
commit build test stage prod
Building auto-remediation into your pipeline
#RedHatOSD
✓ ✓ ✗
commit build test stage prod
Break the
pipeline early
Building auto-remediation into your pipeline
#RedHatOSD
✓ ✓
commit build test stage prod
Include
remediation-as-code
…
Building auto-remediation into your pipeline
#RedHatOSD
✓ ✓
commit build test stage prod
Include
remediation-as-code
✓
… … … … …
✓ ✓
OpenShift Container Platform
Dynatrace Software Intelligence
Ansible Automation
Self-healing applications
Automation
via APIs
▪ Automation/Execution: perform
mitigation/remediation actions
▪ Access to all systems
▪ Monitoring: know what’s going on in your
applications
▪ End-to-end
▪ Full-stack – fully integrated in production
Self-healing building blocks
#RedHatOSD
FullStack-PoweredbyDynatraceOneAgent
Self-healing with Ansible Tower and Dynatrace
#RedHatOSD
● APIs are key to enable automation
● Ansible Tower provides rich API for managing Ansible jobs
● Playbooks can be orchestrated in workflows and job templates
How to enable auto-remediation
Full-stack
environment
is monitored
Anomalies
are detected
automatically
Root
cause
analysis is
performed
Problem
notification
is sent
Event is
received
Job is
triggered
Playbook is
executed
Problem is
remediated
#RedHatOSD
How to enable auto-remediation
Full-stack
environment
is monitored
Anomalies
are detected
automatically
Root
cause
analysis is
performed
Problem
notification
is sent
Event is
received
Job is
triggered
Playbook is
executed
Problem is
remediated
#RedHatOSD
Ansible Tower integration in Dynatrace
#RedHatOSD
Ansible Tower integration in Dynatrace
#RedHatOSD
Ansible Tower integration in Dynatrace
#RedHatOSD
confidential
What we will see in the demo
▪ TicketMonster application running on OpenShift
▪ Full-stack, end-to-end monitoring by Dynatrace
▪ Feature release via Ansible Tower
▪ Auto-remediation as code (Ansible playbooks)
#RedHatOSD
DEMO TIME :)
confidential
What we have seen in the demo – short recap
Release of a
new feature
Dynatrace
detects
increase of
failure rate
Dynatrace
fires a
problem
notification
to Ansible
Tower
Ansible
Tower kicks
off a
playbook
Check for
latest
deployment
with
remediation
scripts
Remediation
script is
executed
Problem is
remediated
#RedHatOSD
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
Escalate at
2AM?
5
1
2
3
4
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
Escalate at
2AM?
2 High Garbage Collection? Adjust/Revert Memory Settings!
5
1
2
3
4
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
3 Issue with BLUE only? Switch back to GREEN!
Escalate at
2AM?
2 High Garbage Collection? Adjust/Revert Memory Settings!
5
1
2
3
4
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
3 Issue with BLUE only? Switch back to GREEN!
Escalate at
2AM?
2 High Garbage Collection? Adjust/Revert Memory Settings!
4 Hung threads? Restart Service!
5
1
2
3
4
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
3 Issue with BLUE only? Switch back to GREEN!
Escalate at
2AM?
2 High Garbage Collection? Adjust/Revert Memory Settings!
4 Hung threads? Restart Service!
5
1
2
3
4
Update Dev
TicketsImpact Mitigated??
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
3 Issue with BLUE only? Switch back to GREEN!
Escalate at
2AM?
2 High Garbage Collection? Adjust/Revert Memory Settings!
4 Hung threads? Restart Service!
5 Still ongoing? Initiate Rollback!
5
1
2
3
4
Mark Bad
Commits
Update Dev
TicketsImpact Mitigated??
Self-healing in an enterprise environment
Auto Mitigate!
1 CPU Exhausted? Add a new service instance!
3 Issue with BLUE only? Switch back to GREEN!
Escalate at
2AM?
2 High Garbage Collection? Adjust/Revert Memory Settings!
4 Hung threads? Restart Service!
5 Still ongoing? Initiate Rollback!
Escalate? Still ongoing?
5
1
2
3
4
Mark Bad
Commits
Update Dev
TicketsImpact Mitigated??
confidential
Embed auto-remediation in your CI/CD pipeline
confidential
Embed auto-remediation in your CI/CD pipeline
Path to NoOps: Self-Healing, …
Shift-Right: Tags, Deploys, Events
confidential
Embed auto-remediation in your CI/CD pipeline
Shift-Left: Break Pipeline Earlier
Path to NoOps: Self-Healing, …
Shift-Right: Tags, Deploys, Events
Actionable Feedback Loops
https://www.ansible.com/blog/enable-self-healing-applications-with-ansible-and-dynatrace
https://www.dynatrace.com/news/blog/set-up-ansible-tower-with-dynatrace-to-enable-your-self-healing-applications/
OpenShift Dynatrace♥
“Beyond years of industry knowledge in the APM space, Dynatrace offers one of the best solutions I’ve seen
for monitoring applications running on OpenShift. What really distinguishes them from others is the use of
artificial intelligence based root-cause analysis. OpenShift is a platform to allow you to run decoupled services
and applications, which can be a monitoring nightmare, but Dynatrace’s insights makes it less scary.”
Chris Morgan, Technical Director – Red Hat OpenShift Ecosystem
GRAZIE PER L’ATTENZIONE
Jürgen Etzlstorfer, Technology Strategist
#RedHatOSD

Contenu connexe

Tendances

Data Loss and Duplication in Kafka
Data Loss and Duplication in KafkaData Loss and Duplication in Kafka
Data Loss and Duplication in KafkaJayesh Thakrar
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Kai Wähner
 
Building an SRE Organization @ Squarespace
Building an SRE Organization @ SquarespaceBuilding an SRE Organization @ Squarespace
Building an SRE Organization @ SquarespaceFranklin Angulo
 
SRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overviewSRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overviewDr Ganesh Iyer
 
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...HostedbyConfluent
 
Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...Chris Richardson
 
Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...
Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...
Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...Vietnam Open Infrastructure User Group
 
Monitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APMMonitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APMElasticsearch
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingAraf Karsh Hamid
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesKubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesSlideTeam
 
Red Hat OpenShift Operators - Operators ABC
Red Hat OpenShift Operators - Operators ABCRed Hat OpenShift Operators - Operators ABC
Red Hat OpenShift Operators - Operators ABCRobert Bohne
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesDataWorks Summit
 
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...Tokuhiro Matsuno
 
서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...
서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...
서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...Jemin Huh
 
Cloud Native Engineering with SRE and GitOps
Cloud Native Engineering with SRE and GitOpsCloud Native Engineering with SRE and GitOps
Cloud Native Engineering with SRE and GitOpsWeaveworks
 
SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)Hussain Mansoor
 
Next-Generation Cloud Native Apps with Spring Cloud and Kubernetes
Next-Generation Cloud Native Apps with Spring Cloud and KubernetesNext-Generation Cloud Native Apps with Spring Cloud and Kubernetes
Next-Generation Cloud Native Apps with Spring Cloud and KubernetesVMware Tanzu
 

Tendances (20)

Data Loss and Duplication in Kafka
Data Loss and Duplication in KafkaData Loss and Duplication in Kafka
Data Loss and Duplication in Kafka
 
Introduction to CI/CD
Introduction to CI/CDIntroduction to CI/CD
Introduction to CI/CD
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
 
Building an SRE Organization @ Squarespace
Building an SRE Organization @ SquarespaceBuilding an SRE Organization @ Squarespace
Building an SRE Organization @ Squarespace
 
SRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overviewSRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overview
 
Cloud Native Application Development
Cloud Native Application DevelopmentCloud Native Application Development
Cloud Native Application Development
 
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
 
Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...Developing applications with a microservice architecture (SVforum, microservi...
Developing applications with a microservice architecture (SVforum, microservi...
 
Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...
Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...
Room 2 - 3 - Nguyễn Hoài Nam & Nguyễn Việt Hùng - Terraform & Pulumi Comparin...
 
Monitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APMMonitor every app, in every stage, with free and open Elastic APM
Monitor every app, in every stage, with free and open Elastic APM
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesKubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
 
Red Hat OpenShift Operators - Operators ABC
Red Hat OpenShift Operators - Operators ABCRed Hat OpenShift Operators - Operators ABC
Red Hat OpenShift Operators - Operators ABC
 
Continuous Delivery Maturity Model
Continuous Delivery Maturity ModelContinuous Delivery Maturity Model
Continuous Delivery Maturity Model
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
promgen - prometheus managemnet tool / simpleclient_java hacks @ Prometheus c...
 
서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...
서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...
서비스 모니터링 구현 사례 공유 - Realtime log monitoring platform-PMon을 ...
 
Cloud Native Engineering with SRE and GitOps
Cloud Native Engineering with SRE and GitOpsCloud Native Engineering with SRE and GitOps
Cloud Native Engineering with SRE and GitOps
 
SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)
 
Next-Generation Cloud Native Apps with Spring Cloud and Kubernetes
Next-Generation Cloud Native Apps with Spring Cloud and KubernetesNext-Generation Cloud Native Apps with Spring Cloud and Kubernetes
Next-Generation Cloud Native Apps with Spring Cloud and Kubernetes
 

Similaire à Shift-left SRE: Self-healing on OpenShift with Ansible

Dev ops with smell v1.2
Dev ops with smell v1.2Dev ops with smell v1.2
Dev ops with smell v1.2Antons Kranga
 
It's Time to Debloat the Cloud with Unikraft
It's Time to Debloat the Cloud with UnikraftIt's Time to Debloat the Cloud with Unikraft
It's Time to Debloat the Cloud with UnikraftScyllaDB
 
[RHFSeoul2017]6 Steps to Transform Enterprise Applications
[RHFSeoul2017]6 Steps to Transform Enterprise Applications[RHFSeoul2017]6 Steps to Transform Enterprise Applications
[RHFSeoul2017]6 Steps to Transform Enterprise ApplicationsDaniel Oh
 
Success Factors for a Mature Microservices Implementation
Success Factors for a Mature Microservices ImplementationSuccess Factors for a Mature Microservices Implementation
Success Factors for a Mature Microservices ImplementationDustin Ruehle
 
Stress Testing at Twitter: a tale of New Year Eves
Stress Testing at Twitter: a tale of New Year EvesStress Testing at Twitter: a tale of New Year Eves
Stress Testing at Twitter: a tale of New Year EvesHerval Freire
 
Serverless in production, an experience report (FullStack 2018)
Serverless in production, an experience report (FullStack 2018)Serverless in production, an experience report (FullStack 2018)
Serverless in production, an experience report (FullStack 2018)Yan Cui
 
Serverless in Production, an experience report (AWS UG South Wales)
Serverless in Production, an experience report (AWS UG South Wales)Serverless in Production, an experience report (AWS UG South Wales)
Serverless in Production, an experience report (AWS UG South Wales)Yan Cui
 
Serverless in production, an experience report (CoDe-Conf)
Serverless in production, an experience report (CoDe-Conf)Serverless in production, an experience report (CoDe-Conf)
Serverless in production, an experience report (CoDe-Conf)Yan Cui
 
Serverless in production, an experience report
Serverless in production, an experience reportServerless in production, an experience report
Serverless in production, an experience reportYan Cui
 
Experts live dtap reinvented, a risk driven approach to release pipelines
Experts live dtap reinvented, a risk driven approach to release pipelinesExperts live dtap reinvented, a risk driven approach to release pipelines
Experts live dtap reinvented, a risk driven approach to release pipelinesRolf Huisman
 
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Rundeck
 
Need to-know patterns building microservices - java one
Need to-know patterns building microservices - java oneNeed to-know patterns building microservices - java one
Need to-know patterns building microservices - java oneVincent Kok
 
Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...
Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...
Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...Amazon Web Services
 
Schneider Electric Scada Global Support Provides Troubleshooting and Technica...
Schneider Electric Scada Global Support Provides Troubleshooting and Technica...Schneider Electric Scada Global Support Provides Troubleshooting and Technica...
Schneider Electric Scada Global Support Provides Troubleshooting and Technica...Preeya Selvarajah
 
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITThings You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITOpenStack
 
Tech trends 2018 2019
Tech trends 2018 2019Tech trends 2018 2019
Tech trends 2018 2019Johan Norm
 
Developer-Friendly CI / CD for Kubernetes
Developer-Friendly CI / CD for KubernetesDeveloper-Friendly CI / CD for Kubernetes
Developer-Friendly CI / CD for KubernetesDevOps Indonesia
 
44CON 2014 - Switches Get Stitches, Eireann Leverett & Matt Erasmus
44CON 2014 - Switches Get Stitches,  Eireann Leverett & Matt Erasmus44CON 2014 - Switches Get Stitches,  Eireann Leverett & Matt Erasmus
44CON 2014 - Switches Get Stitches, Eireann Leverett & Matt Erasmus44CON
 
Continues Deployment - Tech Talk week
Continues Deployment - Tech Talk weekContinues Deployment - Tech Talk week
Continues Deployment - Tech Talk weekrantav
 

Similaire à Shift-left SRE: Self-healing on OpenShift with Ansible (20)

Dev ops with smell v1.2
Dev ops with smell v1.2Dev ops with smell v1.2
Dev ops with smell v1.2
 
It's Time to Debloat the Cloud with Unikraft
It's Time to Debloat the Cloud with UnikraftIt's Time to Debloat the Cloud with Unikraft
It's Time to Debloat the Cloud with Unikraft
 
[RHFSeoul2017]6 Steps to Transform Enterprise Applications
[RHFSeoul2017]6 Steps to Transform Enterprise Applications[RHFSeoul2017]6 Steps to Transform Enterprise Applications
[RHFSeoul2017]6 Steps to Transform Enterprise Applications
 
Success Factors for a Mature Microservices Implementation
Success Factors for a Mature Microservices ImplementationSuccess Factors for a Mature Microservices Implementation
Success Factors for a Mature Microservices Implementation
 
Stress Testing at Twitter: a tale of New Year Eves
Stress Testing at Twitter: a tale of New Year EvesStress Testing at Twitter: a tale of New Year Eves
Stress Testing at Twitter: a tale of New Year Eves
 
Serverless in production, an experience report (FullStack 2018)
Serverless in production, an experience report (FullStack 2018)Serverless in production, an experience report (FullStack 2018)
Serverless in production, an experience report (FullStack 2018)
 
Serverless in Production, an experience report (AWS UG South Wales)
Serverless in Production, an experience report (AWS UG South Wales)Serverless in Production, an experience report (AWS UG South Wales)
Serverless in Production, an experience report (AWS UG South Wales)
 
Serverless in production, an experience report (CoDe-Conf)
Serverless in production, an experience report (CoDe-Conf)Serverless in production, an experience report (CoDe-Conf)
Serverless in production, an experience report (CoDe-Conf)
 
Serverless in production, an experience report
Serverless in production, an experience reportServerless in production, an experience report
Serverless in production, an experience report
 
Experts live dtap reinvented, a risk driven approach to release pipelines
Experts live dtap reinvented, a risk driven approach to release pipelinesExperts live dtap reinvented, a risk driven approach to release pipelines
Experts live dtap reinvented, a risk driven approach to release pipelines
 
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Keeping Your DevOps Transformation From Crushing Your Ops Capacity
Keeping Your DevOps Transformation From Crushing Your Ops Capacity
 
Need to-know patterns building microservices - java one
Need to-know patterns building microservices - java oneNeed to-know patterns building microservices - java one
Need to-know patterns building microservices - java one
 
Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...
Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...
Dev Tested, Ops Approved: 10 Guardrails from Atlassian for Better, Faster Dev...
 
Schneider Electric Scada Global Support Provides Troubleshooting and Technica...
Schneider Electric Scada Global Support Provides Troubleshooting and Technica...Schneider Electric Scada Global Support Provides Troubleshooting and Technica...
Schneider Electric Scada Global Support Provides Troubleshooting and Technica...
 
Industrial IoT bootcamp
Industrial IoT bootcampIndustrial IoT bootcamp
Industrial IoT bootcamp
 
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst ITThings You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
Things You MUST Know Before Deploying OpenStack: Bruno Lago, Catalyst IT
 
Tech trends 2018 2019
Tech trends 2018 2019Tech trends 2018 2019
Tech trends 2018 2019
 
Developer-Friendly CI / CD for Kubernetes
Developer-Friendly CI / CD for KubernetesDeveloper-Friendly CI / CD for Kubernetes
Developer-Friendly CI / CD for Kubernetes
 
44CON 2014 - Switches Get Stitches, Eireann Leverett & Matt Erasmus
44CON 2014 - Switches Get Stitches,  Eireann Leverett & Matt Erasmus44CON 2014 - Switches Get Stitches,  Eireann Leverett & Matt Erasmus
44CON 2014 - Switches Get Stitches, Eireann Leverett & Matt Erasmus
 
Continues Deployment - Tech Talk week
Continues Deployment - Tech Talk weekContinues Deployment - Tech Talk week
Continues Deployment - Tech Talk week
 

Dernier

Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Lecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).pptLecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).pptesrabilgic2
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 

Dernier (20)

Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Lecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).pptLecture # 8 software design and architecture (SDA).ppt
Lecture # 8 software design and architecture (SDA).ppt
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 

Shift-left SRE: Self-healing on OpenShift with Ansible

  • 1. Shift-Left SRE - Self-healing on OpenShift with Ansible #RedHatOSD Jürgen Etzlstorfer, Technology Strategist @jetzlstorfer
  • 2. A relentless pursuit of software perfection 1600 Employees 5000 Enterprise Customers 79 of the Global 100
  • 3. confidential If you write applications, they will break eventually ~ Murphy‘s law
  • 4. On average, a single transaction uses 82 different types of technology Browser Multi-geo Mobile Network Code Hosts Logs IoT 3rd parties Services Cloud SDN Containers Applications are getting more complex!
  • 5. confidential What if you had something similar to a self-healing robot?
  • 7. Building auto-remediation into your pipeline #RedHatOSD ✓ ✓ ✓ ✓ commit build test stage prod ✓
  • 8. Building auto-remediation into your pipeline #RedHatOSD ✓ ✓ ✓ ✓ ✗ commit build test stage prod
  • 9. Building auto-remediation into your pipeline #RedHatOSD ✓ ✓ ✗ commit build test stage prod Break the pipeline early
  • 10. Building auto-remediation into your pipeline #RedHatOSD ✓ ✓ commit build test stage prod Include remediation-as-code …
  • 11. Building auto-remediation into your pipeline #RedHatOSD ✓ ✓ commit build test stage prod Include remediation-as-code ✓ … … … … … ✓ ✓
  • 12. OpenShift Container Platform Dynatrace Software Intelligence Ansible Automation Self-healing applications
  • 13. Automation via APIs ▪ Automation/Execution: perform mitigation/remediation actions ▪ Access to all systems ▪ Monitoring: know what’s going on in your applications ▪ End-to-end ▪ Full-stack – fully integrated in production Self-healing building blocks #RedHatOSD
  • 15. Self-healing with Ansible Tower and Dynatrace #RedHatOSD ● APIs are key to enable automation ● Ansible Tower provides rich API for managing Ansible jobs ● Playbooks can be orchestrated in workflows and job templates
  • 16. How to enable auto-remediation Full-stack environment is monitored Anomalies are detected automatically Root cause analysis is performed Problem notification is sent Event is received Job is triggered Playbook is executed Problem is remediated #RedHatOSD
  • 17. How to enable auto-remediation Full-stack environment is monitored Anomalies are detected automatically Root cause analysis is performed Problem notification is sent Event is received Job is triggered Playbook is executed Problem is remediated #RedHatOSD
  • 18. Ansible Tower integration in Dynatrace #RedHatOSD
  • 19. Ansible Tower integration in Dynatrace #RedHatOSD
  • 20. Ansible Tower integration in Dynatrace #RedHatOSD
  • 21. confidential What we will see in the demo ▪ TicketMonster application running on OpenShift ▪ Full-stack, end-to-end monitoring by Dynatrace ▪ Feature release via Ansible Tower ▪ Auto-remediation as code (Ansible playbooks) #RedHatOSD
  • 23. confidential What we have seen in the demo – short recap Release of a new feature Dynatrace detects increase of failure rate Dynatrace fires a problem notification to Ansible Tower Ansible Tower kicks off a playbook Check for latest deployment with remediation scripts Remediation script is executed Problem is remediated #RedHatOSD
  • 24. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! Escalate at 2AM? 5 1 2 3 4
  • 25. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! Escalate at 2AM? 2 High Garbage Collection? Adjust/Revert Memory Settings! 5 1 2 3 4
  • 26. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! 3 Issue with BLUE only? Switch back to GREEN! Escalate at 2AM? 2 High Garbage Collection? Adjust/Revert Memory Settings! 5 1 2 3 4
  • 27. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! 3 Issue with BLUE only? Switch back to GREEN! Escalate at 2AM? 2 High Garbage Collection? Adjust/Revert Memory Settings! 4 Hung threads? Restart Service! 5 1 2 3 4
  • 28. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! 3 Issue with BLUE only? Switch back to GREEN! Escalate at 2AM? 2 High Garbage Collection? Adjust/Revert Memory Settings! 4 Hung threads? Restart Service! 5 1 2 3 4 Update Dev TicketsImpact Mitigated??
  • 29. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! 3 Issue with BLUE only? Switch back to GREEN! Escalate at 2AM? 2 High Garbage Collection? Adjust/Revert Memory Settings! 4 Hung threads? Restart Service! 5 Still ongoing? Initiate Rollback! 5 1 2 3 4 Mark Bad Commits Update Dev TicketsImpact Mitigated??
  • 30. Self-healing in an enterprise environment Auto Mitigate! 1 CPU Exhausted? Add a new service instance! 3 Issue with BLUE only? Switch back to GREEN! Escalate at 2AM? 2 High Garbage Collection? Adjust/Revert Memory Settings! 4 Hung threads? Restart Service! 5 Still ongoing? Initiate Rollback! Escalate? Still ongoing? 5 1 2 3 4 Mark Bad Commits Update Dev TicketsImpact Mitigated??
  • 32. confidential Embed auto-remediation in your CI/CD pipeline Path to NoOps: Self-Healing, … Shift-Right: Tags, Deploys, Events
  • 33. confidential Embed auto-remediation in your CI/CD pipeline Shift-Left: Break Pipeline Earlier Path to NoOps: Self-Healing, … Shift-Right: Tags, Deploys, Events Actionable Feedback Loops
  • 35. OpenShift Dynatrace♥ “Beyond years of industry knowledge in the APM space, Dynatrace offers one of the best solutions I’ve seen for monitoring applications running on OpenShift. What really distinguishes them from others is the use of artificial intelligence based root-cause analysis. OpenShift is a platform to allow you to run decoupled services and applications, which can be a monitoring nightmare, but Dynatrace’s insights makes it less scary.” Chris Morgan, Technical Director – Red Hat OpenShift Ecosystem
  • 36. GRAZIE PER L’ATTENZIONE Jürgen Etzlstorfer, Technology Strategist #RedHatOSD