A summary of several years of hands-on DevOps experience. The "Aggressive DevOps" concept as part of the company culture at Xi Group Ltd. Thoughts on data gathering, monitoring, and alerting. The technology stacks and culture behind the DevOps movement.
2. What is “DevOps”?!
• Is it a technology / tooling?!
• Is it a cultural thing?!
• Is it a business thing?!
• Should I even care …
3. DevOps
• Development + Operations
• It is technological in nature!
• It is a cultural thing!
• “The Business” needs it!
4. A few myths about DevOps
• Developers can do Ops
• System Administrators are obsolete
• You need it only for the Cloud
• It is a supplementary activity
5. Why should I care?!
• Because today everything is distributed …
• … and distributed systems are hard!
• Because IT complexity is constantly growing!
• Because it allows you to scale the human factor!
9. The new normal
• Distributed systems are complex and fragile!
• Distributed systems come with control planes!
• Service discovery is required!
• “errāre hūmānum est” (Seneca)
10. … and from the ashes DevOps will rise, Fierce and Mighty …
12. … to help us …
• … change system architectures …
• … bring order to chaos …
• … build and deploy the product …
• … monitor everything …
• … analyze the log files …
• … educate Developers in all things Ops …
• … build data-driven control planes …
• … and much, much more …
13. New problems require new tools
• Configuration management
– Puppet, Chef, Ansible, Salt
– Vagrant
– Fabric, Gearman
• Infrastructure-as-Code tools
– AWS CLI / python-boto, REST API
– Joyent SmartDataCenter / node.js sdc
– Rackspace, CloudFlare / REST API
14. New problems require new tools
• Build and deployment automation
– Jenkins / Hudson
– Travis CI
– BuildBot
• Service Discovery
– DNS-SD
– CoreOS etcd, Heroku Doozer, Apache ZooKeeper
– Consul & Serf
15. New problems require new tools
• Full-stack application monitoring
– Graphite, Ganglia
– New Relic
– Stackdriver, SignalFuse, Boundary, AWS CloudWatch, …
• Alerting systems
– Nagios (really?!)
– Sensu
– PagerDuty, AWS SNS, …
17. Focus: Continuous Delivery
• Build on every commit / merge!
• Deploy after every build!
• Verify / Smoke-test the deployment!
• If possible, route some real traffic to it!
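The four steps above can be sketched as a small pipeline. This is a minimal illustration, not a real CI integration: `deliver`, `canary_bucket`, and the `build` / `deploy` / `smoke_test` callables are hypothetical names, and the 5% canary share is an assumed default.

```python
import hashlib
from typing import Callable


def canary_bucket(request_id: str, canary_percent: int) -> str:
    """Deterministically send roughly canary_percent% of requests to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first two bytes of the hash to a slot in the range 0..99.
    slot = int.from_bytes(digest[:2], "big") % 100
    return "canary" if slot < canary_percent else "stable"


def deliver(commit: str,
            build: Callable[[str], str],
            deploy: Callable[[str], str],
            smoke_test: Callable[[str], bool],
            canary_percent: int = 5) -> dict:
    """Build on commit, deploy the build, verify it, then expose it to real traffic."""
    artifact = build(commit)       # build on every commit / merge
    endpoint = deploy(artifact)    # deploy after every build
    if not smoke_test(endpoint):   # verify / smoke-test the deployment
        return {"commit": commit, "status": "rolled_back", "canary_percent": 0}
    # Only a verified deployment receives a share of real traffic.
    return {"commit": commit, "status": "canary", "canary_percent": canary_percent}
```

Hashing the request ID keeps routing sticky: the same request ID always lands in the same bucket, which makes canary behavior easier to debug.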
18. Focus: Monitoring
• Nagios is obsolete but still relevant …
• … it can still handle quite a lot of load …
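One reason Nagios stays relevant is its simple plugin contract: a check prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). A minimal sketch of such a check follows; the function name `check_threshold` and the example thresholds are assumptions for illustration.

```python
from typing import Optional, Tuple

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_threshold(name: str, value: Optional[float],
                    warn: float, crit: float) -> Tuple[int, str]:
    """Return (exit_code, status_line) for a 'higher is worse' metric."""
    if value is None:
        # No measurement available: report UNKNOWN rather than guessing.
        return UNKNOWN, f"{name} UNKNOWN - no data"
    if value >= crit:
        code = CRITICAL
    elif value >= warn:
        code = WARNING
    else:
        code = OK
    # Text after the pipe is performance data: label=value;warn;crit
    perfdata = f"{name.lower()}={value};{warn};{crit}"
    return code, f"{name} {LABELS[code]} - value={value} | {perfdata}"
```

A wrapper script would print the status line and pass the code to `sys.exit()`. Sensu can run checks written to the same contract.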
22. Focus: Alerting
• Alert when human intervention is required!
• Automate otherwise!
• Have a direct link between alert and operational procedure!
• Account for “pager fatigue”!
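These rules can be expressed as a small routing policy. The sketch below is illustrative only: `Event`, `AlertRouter`, the 15-minute quiet period, and the returned action strings are all assumptions, not part of any particular alerting product.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Event:
    check: str
    host: str
    timestamp: float                         # seconds since the epoch
    auto_remediation: Optional[str] = None   # name of an automated fix, if one exists
    runbook_url: Optional[str] = None        # operational procedure for humans


class AlertRouter:
    def __init__(self, quiet_period: float = 900.0):
        self.quiet_period = quiet_period
        self._last_page: Dict[Tuple[str, str], float] = {}

    def route(self, event: Event) -> str:
        # Automate whenever no human intervention is required.
        if event.auto_remediation is not None:
            return f"automate:{event.auto_remediation}"
        # Every page must link directly to an operational procedure.
        if event.runbook_url is None:
            return "reject:missing-runbook"
        # Limit pager fatigue: one page per check and host per quiet period.
        key = (event.check, event.host)
        last = self._last_page.get(key)
        if last is not None and event.timestamp - last < self.quiet_period:
            return "suppress:duplicate"
        self._last_page[key] = event.timestamp
        return f"page:{event.runbook_url}"
```

Rejecting alerts that carry no runbook is a deliberate design choice in this sketch: it forces the procedure to be written before the alert can wake anyone up.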
24. Focus: Smart Control Plane
• Monitoring data is consumed by the Control Plane.
• Workloads drive the elastic behavior of the distributed system.
• Business-specific logic is used to guide the operational decision-making process.
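A minimal sketch of one such decision: monitoring data (observed request rate) goes in, and a worker count bounded by business limits comes out. `Policy`, `desired_workers`, and the per-worker throughput figure are hypothetical; a real control plane would also smooth its inputs and rate-limit changes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    target_rps_per_worker: float  # load one worker is expected to handle
    min_workers: int              # availability floor
    max_workers: int              # business constraint, e.g. a budget cap


def desired_workers(observed_rps: float, policy: Policy) -> int:
    """Turn monitoring data into a scaling decision within business limits."""
    if policy.target_rps_per_worker <= 0:
        raise ValueError("target_rps_per_worker must be positive")
    # The workload drives the elastic behavior ...
    needed = math.ceil(observed_rps / policy.target_rps_per_worker)
    # ... while the business-specific policy bounds the result.
    return max(policy.min_workers, min(policy.max_workers, needed))
```

With `Policy(100.0, 2, 10)`, an observed 250 requests per second yields 3 workers, while a burst of 5000 is capped at 10.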
25. Focus: Smart Control Plane
• Control Plane is Data-Driven.
• Control Plane is Pro-active.
• Control Plane is “aware” of business goals.
• Control Plane must be highly available.
28. New problems require new culture
• Operational input in all phases of the Software Development Life Cycle!
• Operational instrumentation is part of the core product!
• Enterprise silos are NO MORE!
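To illustrate instrumentation living inside the product code, here is a minimal sketch that times a block and formats the result in Graphite's plaintext protocol (`<path> <value> <timestamp>` followed by a newline). The helper names `graphite_line` and `timed`, and the metric path in the usage note, are assumptions; sending the lines to a Carbon listener is left out.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List


def graphite_line(path: str, value: float, timestamp: int) -> str:
    """Format one sample in Graphite's plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


@contextmanager
def timed(path: str, sink: List[str],
          clock: Callable[[], float] = time.time) -> Iterator[None]:
    """Time the wrapped block and append the sample to sink."""
    start = clock()
    try:
        yield
    finally:
        # Record the duration even if the wrapped block raises.
        elapsed_ms = round((clock() - start) * 1000.0, 3)
        sink.append(graphite_line(path, elapsed_ms, int(clock())))
```

Usage: wrap a business operation with `with timed("shop.checkout.latency_ms", samples): ...` and flush `samples` to the metrics backend periodically.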
29. What about the “Aggressive”?!
Well … try implementing all of the above in a typical company … ;)