SCALING ENGINEERING WITH DOCKER
A Case Study
https://flic.kr/p/ba2mjn
TOM LEACH
@tomtheguvnor
github.com/tleach
TRAVIS THIEMAN
@thieman
github.com/thieman
gc.com/about
WHAT WE’LL COVER
1. What motivated GameChanger to adopt Docker?
2. Walkthrough of GameChanger Deploy Pipeline
WHAT MOTIVATED GAMECHANGER TO
ADOPT DOCKER?
• Scorekeeping
• 150+ Stats
• Live Gamestream
• Team management
• 12TB (10 histories
of pro sports)
• 10 MongoDB shards
• 100-400 app servers
• 50K games/day
(10K concurrent)
• 3000 w/s, 30,000 r/s
GameChanger’s market is amateur sports. Whereas ESPN caters to a handful of top professional teams in the country, GameChanger provides free tools to the millions of
amateur sports teams around the world.
ELIMINATING DEPLOY-TIME RISK
This graph shows the number of requests/second received by one of our services over the last week. The area under the graph is broken down by host. You can see that
we are scaling our hosts up and down in response to demand. 

At GC we have an extremely spiky traffic profile, so using autoscaling is important to control costs. Therefore it’s very important not only to deploy new application code
to existing servers but also to be able to very reliably build new servers with minimal risk.
[Diagram: a Chef Server and six App Servers, each App Server labelled "30"]
To illustrate the risks associated with a traditional Configuration Management approach to building servers, let’s look at the typical Chef architecture. 

This is a CM server which hosts the current valid configuration data for the cluster (and by configuration data we also mean setup scripts etc)

The Developer is responsible for pushing new configuration to the CM server and then all app servers periodically pull and execute the latest scripts. 

Risks:

- CM server is a SPOF. Chef is painful to scale out. 

- CM server needs to be scaled to support max conceivable cluster size (or we have problems when we need it most)

- Thundering herd
github.com/miketheman/knife-role-spaghetti
This is a visualization of GC’s role/recipe dependencies in Chef before we moved to Docker. 

Risks:

- Spaghetti-like dependencies are impossible to reason about (what happens if I upgrade node.js?)

- Dependencies are indirect and not explicit

- Testing is expensive and time consuming. Devs are disincentivized from testing. 

- Coupling issues not discovered until deploy time -> can take down your cluster.

- Rollback can be painful
App
Server
PyPI
npm
Ubuntu apt-get
rubygems
.tar.gz files
apache
binaries
S3
GitHub
Deploy-time dependencies on multiple external repositories are a big risk. The usual workarounds have costs of their own:

- Build AMIs (complex, heavy, time consuming, does not allow us to iterate fast enough)

- Host your own mirror for services like PyPI. But then who owns the maintenance of that mirror?
HOW DOES DOCKER
ELIMINATE THESE RISKS?
• Assets are baked into an immutable image at build time
• No deploy-time dependencies on 3rd party repos
• Docker registry is simple and easy to scale
• Dependencies simple, explicit and direct
• Rollback is trivial
SCALING ENGINEERING
A less obvious problem with traditional CM approaches is how they inhibit the scaling of engineering. Let’s illustrate with an example…
[Diagram: one Application, one Feature Team]
Like many companies, GC’s product started out as a small Python app developed by a couple of people. At this point we only have a few users so we can run on a
couple of servers and deployment is a simple manual step.
[Diagram: one Application, two Feature Teams]
As we build more features our application gets bigger and we hire more people to help build and maintain those features. We’re still doing some form of manual
deployment at this point, and though it’s starting to become a bottleneck we’re still prioritizing feature development.
[Diagram: one Monolithic Application, three Feature Teams]
We grow further and our application grows, accumulating more and more responsibilities. The need to coordinate test + build + deploy necessitates an Ops team to own
this problem.
[Diagram: one Monolithic Application, three Feature Teams, and an Ops Team owning Deployment (Test + Build + Deploy)]
Following a more “Dev Ops” mantra, these responsibilities form more of a continuum. Devs care about getting their code to prod, Ops care about what the code does,
both cooperate. 

Deploying a monolithic app in this way actually works pretty well. The tech stack is fairly static, and forms a shared context which minimizes the friction between dev and
ops teams. 

The problem for GC was that this monolithic architecture scaled poorly for us:

- Poor ownership boundaries

- Quality of shared components suffered

- Introducing new languages is difficult to sell

- Different features have different CAP requirements

- Operational problems derived from indirect coupling
[Diagram: three Feature Teams, a set of microservices (μ), and one Ops Team owning Deployment (Test + Build + Deploy)]
Solution: Teams own collections of independently-scalable microservices with clear ownership boundaries. Teams ar

But this poses a problem for our previous deployment approach:

- Suddenly Ops need to know how to deploy an ever growing list of technologies

- Information friction between Dev and Ops is high as the context is dynamic

- Deployment using something like Chef becomes more and more complex

- As feature teams are added, Ops becomes a bottleneck and the relationship risks becoming adversarial
–Melvin Conway, 1968
“Any organization that designs a system … will
inevitably produce a design whose structure is a
copy of the organization's communication structure.”
Conway’s Law

A collection of teams that design a system will inevitably produce a design which evolves from the minimum amount of out-of-band communication required between
those teams.
[Diagram: three Feature + Ops Teams, each with a set of microservices (μ) and its own Deployment]
Faced with the need for high-traffic, high-complexity communication to get software deployed, teams will be motivated towards compartmentalizing the way they approach
deployment. This is much better as the contextual footprint for each mini-Ops team is manageable. 

But there is still a problem here. We risk duplicating effort across teams on “core” deployment activities.
CORE DEPLOYMENT TASKS
• Log rotation
• User account creation,
sudoers, SSH keys
• Continuous Integration
• Metrics
• DNS
• Monitoring & alerting
• ulimits
• Tool installation
• …
All of these are important. Doing them well requires that they be owned and continuously improved and maintained as a first-class system asset. Feature teams will not
treat them this way, and we’ll duplicate effort building several half-formed implementations of these things.
[Diagram: three Feature Teams, each with a set of microservices (μ) and its own Build, and one Ops Team owning the “Core” Deployment Pipeline]
We still needed an Ops Team to own the core parts of deployment, but we needed the interface between the feature and Ops teams to have low information
friction and not require the Ops team to understand n different tech stacks. 

We could have tried to use Chef to do this by making each team own its own roles, but you end up running into problems around shared dependencies, global state and
indirect coupling. 

Docker provides a neat abstraction which allows these responsibilities to be separated clearly and scalably.
HOW DOES DOCKER ALLOW US TO
SCALE ENGINEERING?
1. Development team has complete control over what they deploy
2. Core deployment can still be owned by a dedicated team as a first
class concern
3. Small shared context needed for cross-team communication
1. allows us to scale teams out linearly without creating a centralized bottleneck

2. eliminates wasted duplicate effort and the effort of maintaining a substandard system

3. eliminates wasted effort communicating complex requirements in an out-of-band way
GAMECHANGER DEPLOY PIPELINE
Test Build Deploy
We’re going to run through the test-build-deploy pipeline at GameChanger. We’re using a separate service for
each of those, so let’s introduce the cast of characters.
Drone
The Test Runner
Docker-based testing
Fully isolated and
concurrent tests
Easily tests against
service containers,
e.g. Postgres
Jenkins
The Build Server
In charge of building
Docker containers,
app packages, etc.
Can do almost anything,
but isolation not a
strong suit
Bagel
The Deploy Service
Manages versioning of
deployable apps across
environments
One-click deploy, rollback
mmm, bagel
Test Build Deploy
We’re going to go through what it takes to wire up an application to work with our pipeline…
Test Build Deploy
…and while we’re doing it, we’re going to highlight the ways in which Docker helps us achieve the goals that Tom
was talking about earlier.
Python + Postgres: A Simple Application
Let’s consider a simple Python application that works with a Postgres database. It has a bunch of unit tests,
including unit tests that require connecting to an actual Postgres instance to run.
Test
Removes some dependency
setup concerns from
application dev
Tests are fully isolated,
coupling is minimized
Fast and parallelizable,
multiple teams can work
on a single app without
slowing each other down
Drone
The Test Runner
Drone, as we mentioned before, is what we use to run tests. What are the benefits we get from using Drone?
Drone uses a simple YAML specification
Application devs do
not need to know
how to install these
Use official images or
write a generalized
image once, share with
all your teams
Test
Server
Host OS
Each test run is
fully isolated using
containers
Parallel testing
becomes trivial
Fast testing of PRs
reduces likelihood of
breaking the build
Test
Build
Receives Git hash and
static dependencies
from Drone
Builds Docker images,
pushes to our private
Docker registry
Jenkins
The Build Server
[Diagram: Drone, PyPI, Jenkins, Registry, Image]
Build
Tested application code
(specific Git hash)
Same library versions used
to test that code, e.g. pip
freeze, npm shrinkwrap
All system libraries,
drivers, etc.
Build
What exactly are we putting into our images?
Note that our image will *not* contain service dependencies like the Postgres we want to run against. We have a
few options for how to connect to a database at runtime.
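The "same library versions" point depends on every dependency being pinned to an exact version, as `pip freeze` output is. A minimal sketch of a check a build step could run before baking an image, assuming a requirements-style list of lines; the function name `unpinned_requirements` and the sample packages are hypothetical and not part of GameChanger's pipeline:

```python
import re

# Matches "name==version", with optional extras such as "requests[security]==2.7.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.*+!_-]+$")


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue  # skip blank and comment-only lines
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    # Hypothetical sample; a real build would read the app's requirements file.
    sample = ["flask==0.10.1", "psycopg2==2.6", "requests>=2.0  # floating version"]
    print(unpinned_requirements(sample))
```

A build that refuses to continue when this list is non-empty keeps the tested versions and the shipped versions from drifting apart.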
[Diagram: Drone, PyPI, Jenkins, Registry, Image]
Build
So this is how we make the build work with Docker. Why is this actually better than the traditional model? Well…
[Diagram: Drone, PyPI, Jenkins, Registry, Image, with an X marking a failure]
Build
…what if we put a bullet in PyPI and are no longer able to get our library dependencies?
Before we answer that, what happened in the old world? We’d deploy new code, our servers would all try to pull
from PyPI, fail, and freak out. Is our site down? Are we up but with incorrect or partial dependencies? It’s not
great. What happens with Docker?
[Diagram: Drone, PyPI, Jenkins, Registry, Image, with an X marking a failure]
Build
With Docker, that risk is moved from deploy time to build time. Jenkins will try to pull from PyPI and fail. It won’t
be able to push a new image with your updated code. This is usually a good thing! You can have confidence that
all the images you *do* have will have all their dependencies and be fully working images.
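The shift from deploy-time to build-time failure can be illustrated with a small simulation. This is only a sketch under stated assumptions: the registry is a plain dict, the package index is a callable, and every name here (`build_image`, `deploy`, `DependencyUnavailable`) is hypothetical rather than anything from Jenkins or Docker.

```python
class DependencyUnavailable(Exception):
    """Raised when the (simulated) package index cannot be reached."""


def build_image(git_hash, fetch_dependencies, registry):
    """Build an image for git_hash and publish it only if all dependencies were fetched.

    fetch_dependencies is a callable returning {package: version};
    registry is a dict standing in for a Docker registry.
    Returns True if an image was published, False otherwise.
    """
    try:
        deps = fetch_dependencies()
    except DependencyUnavailable:
        # The failure stops here, at build time. Nothing is pushed,
        # so the registry still holds only complete images.
        return False
    registry[git_hash] = {"code": git_hash, "deps": deps}
    return True


def deploy(git_hash, registry):
    """Deploy reads only from the registry; it never contacts the package index."""
    return registry[git_hash]


def index_up():
    return {"flask": "0.10.1"}


def index_down():
    raise DependencyUnavailable("package index unreachable")


if __name__ == "__main__":
    registry = {}
    print(build_image("abc123", index_up, registry))    # index reachable
    print(build_image("def456", index_down, registry))  # index unreachable
    print(sorted(registry))
```

The second build fails without touching the registry, so anything that can be deployed is an image that was built completely.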
Mostly a thin API on
top of our Docker registry
Also owns triggering
deploys across our
infrastructure
Deploy
Bagel
The Deploy Service
Bagel gives us a way to coordinate the images in our Docker registry with their corresponding Git tags, the
dependencies that were baked into the images, etc.
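To make the versioning and rollback idea concrete, here is a minimal in-memory sketch. It is not Bagel's API, which the talk does not show; the class `DeployHistory`, its methods, and the tags are assumptions made for illustration. It relies on the fact that with immutable images a rollback is only a matter of pointing an environment back at an earlier tag.

```python
class DeployHistory:
    """Tracks which image tag is live for each (app, environment) pair."""

    def __init__(self):
        self._history = {}  # (app, env) -> list of image tags, newest last

    def deploy(self, app, env, image_tag):
        """Record image_tag as the live version of app in env."""
        self._history.setdefault((app, env), []).append(image_tag)

    def current(self, app, env):
        """Return the live tag, or None if nothing has been deployed."""
        tags = self._history.get((app, env), [])
        return tags[-1] if tags else None

    def rollback(self, app, env):
        """Repoint the environment at the previously deployed tag and return it."""
        tags = self._history.get((app, env), [])
        if len(tags) < 2:
            raise ValueError("no earlier version to roll back to")
        tags.pop()
        return tags[-1]


if __name__ == "__main__":
    history = DeployHistory()
    history.deploy("api", "production", "v1")
    history.deploy("api", "production", "v2")
    print(history.rollback("api", "production"))
```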
I think just go into a quick demo here, show some cool dependency diffs and PR messages or something.
Deploy
deploy.travisthieman.com
Deploys triggered
via gossip protocol
Coordinated with
distributed locks
What happens when we hit the Deploy button in Bagel?
All our machines run identical
OS-level images
Images and runtime config
specified via YAML
Deploy mechanism on each
machine reconciles spec and
containers currently running
Similar to Docker Compose
All our boxes run off the same machine image (AMI). A YAML file specifying which of our apps should be
deployed to that box is all that distinguishes it. Our pretty-dumb deploy scripts (triggered by Bagel) handle
matching the running state of that box’s containers to what’s in this YAML file and the current deployed versions
according to Bagel.
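The reconcile step described above can be sketched as a comparison between two maps. This is an illustration only: the real deploy scripts are not shown in the talk, and the function name `reconcile`, the `{app: image_tag}` shape, and the action names are assumptions.

```python
def reconcile(desired, running):
    """Compare desired and running {app: image_tag} maps and return the actions needed.

    Returns a sorted list of (action, app, image_tag) tuples, where action is
    "start", "stop", or "replace".
    """
    actions = []
    for app, tag in running.items():
        if app not in desired:
            # Running but no longer in the spec: stop it.
            actions.append(("stop", app, tag))
        elif desired[app] != tag:
            # Running the wrong version: swap in the desired one.
            actions.append(("replace", app, desired[app]))
    for app, tag in desired.items():
        if app not in running:
            # In the spec but not running: start it.
            actions.append(("start", app, tag))
    return sorted(actions)


if __name__ == "__main__":
    desired = {"api": "v2", "worker": "v1"}  # hypothetical spec for one box
    running = {"api": "v1", "cron": "v3"}    # hypothetical current containers
    for action in reconcile(desired, running):
        print(action)
```

Because the function only compares state, running it again after the actions have been applied yields an empty list, which is what makes this style of deploy safe to retry.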
Test Build Deploy
That’s our deploy pipeline. Using Docker, we’ve seen significant gains in simplicity and developer productivity
across our test, build, and deploy stages. Our feature teams can release new services with ease, and our Ops
team has been phased out of existence. Our engineers are now free to focus on problems that benefit our
customers and our business.
Before I go, just a few closing thoughts on Docker as someone who’s spent a bit of time with it…
Young Ecosystem
Roll Your Own?
Build Process Intricacies
Docker on OS X
Development Environments
QUESTIONS?
https://flic.kr/p/cQ3kfu
@tomtheguvnor
github.com/tleach
@thieman
github.com/thieman

VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 

Scaling Engineering with Docker

  • 1. SCALING ENGINEERING WITH DOCKER A Case Study https://flic.kr/p/ba2mjn
  • 3. WHAT WE’LL COVER 1. What motivated GameChanger to adopt Docker? 2. Walkthrough of GameChanger Deploy Pipeline
  • 5. • Scorekeeping • 150+ Stats • Live Gamestream • Team management • 12TB (10 histories of pro sports) • 10 MongoDB shards • 100-400 app servers • 50K games/day (10K concurrent) • 3000 w/s, 30,000 r/s GameChanger’s market is amateur sports. Whereas ESPN caters to a handful of top professional teams in the country, GameChanger provides free tools to the millions of amateur sports teams around the world.
  • 6. ELIMINATING DEPLOY-TIME RISK This graph shows the number of requests/second received by one of our services over the last week. The area under the graph is broken down by host. You can see that we are scaling our hosts up and down in response to demand. At GC we have an extremely spiky traffic profile, so using autoscaling is important to control costs. Therefore it’s very important not only to deploy new application code to existing servers but also to be able to very reliably build new servers with minimal risk.
  • 7. [Diagram: one Chef Server and six App Servers, each labeled 30] To illustrate the risks associated with a traditional Configuration Management approach to building servers, let’s look at the typical Chef architecture. This is a CM server which hosts the current valid configuration data for the cluster (and by configuration data we also mean setup scripts, etc.). The developer is responsible for pushing new configuration to the CM server, and then all app servers periodically pull and execute the latest scripts. Risks: - CM server is a SPOF. Chef is painful to scale out. - CM server needs to be scaled to support the maximum conceivable cluster size (or we have problems when we need it most) - Thundering herd
  • 8. github.com/miketheman/knife-role-spaghetti This is a visualization of GC’s role/recipe dependencies in Chef before we moved to Docker. Risks: - Spaghetti-like dependencies are impossible to reason about (what happens if I upgrade node.js?) - Dependencies are indirect and not explicit - Testing is expensive and time-consuming. Devs are disincentivized from testing. - Coupling issues not discovered until deploy time -> can take down your cluster. - Rollback can be painful
  • 9. App Server PyPI npm Ubuntu apt-get rubygems .tar.gz files apache binaries S3 GitHub Deploy-time dependencies on multiple external repositories are a big risk. - Build AMIs (complex, heavy, time-consuming; does not allow us to iterate fast enough) - Host your own mirror for services like PyPI. But then who owns the maintenance of that mirror?
  • 10. HOW DOES DOCKER ELIMINATE THESE RISKS? • Assets are baked into an immutable image at build time • No deploy-time dependencies on 3rd party repos • Docker registry is simple and easy to scale • Dependencies simple, explicit and direct • Rollback is trivial
  • 11. SCALING ENGINEERING A less obvious problem with traditional CM approaches is how they inhibit the scaling of engineering. Let’s illustrate with an example…
  • 12. Application Feature Team Like many companies, GC’s product started out as a small Python app developed by a couple of people. At this point we only have a few users so we can run on a couple of servers and deployment is a simple manual step.
  • 13. Application Feature Team Feature Team As we build more features our application gets bigger and we hire more people to help build and maintain those features. We’re still doing some form of manual deployment at this point, and though it’s starting to become a bottleneck we’re still prioritizing feature development.
  • 14. Monolithic Application Feature Team Feature Team Feature Team We grow further and our application grows, accumulating more and more responsibilities. The need to coordinate test + build + deploy necessitates an Ops team to own this problem.
  • 15. Monolithic Application Feature Team Feature Team Feature Team Deployment (Test + Build + Deploy) Ops Team Following a more “DevOps” mantra, these responsibilities form more of a continuum. Devs care about getting their code to prod, Ops care about what the code does, both cooperate. Deploying a monolithic app in this way actually works pretty well. The tech stack is fairly static, and forms a shared context which minimizes the friction between dev and ops teams. The problem for GC was that this monolithic architecture scaled poorly for us: - Poor ownership boundaries - Quality of shared components suffered - Introducing new languages is difficult to sell - Different features have different CAP requirements - Operational problems derived from indirect coupling
  • 16. μ Feature Team Feature Team Feature Team Ops Team μ μ μ μ μ μ μ μ μ μ μ Deployment (Test + Build + Deploy) Solution: Teams own collections of independently-scalable microservices with clear ownership boundaries. Teams ar But this poses a problem for our previous deployment approach: - Suddenly Ops need to know how to deploy an ever-growing list of technologies - Information friction between Dev and Ops is high as the context is dynamic - Deployment using something like Chef becomes more and more complex - As feature teams are added, Ops becomes a bottleneck, and the relationship risks becoming adversarial
  • 17. Conway’s Law: “Any organization that designs a system … will inevitably produce a design whose structure is a copy of the organization's communication structure.” –Melvin Conway, 1968. A collection of teams that design a system will inevitably produce a design which evolves from the minimum amount of out-of-band communication required between those teams.
  • 18. μ Feature + Ops Team Feature + Ops Team Feature + Ops Team μ μ μ μ μ μ μ μ μ μ μ Deployment Deployment Deployment In the face of the need for high-traffic, high-complexity communication to get software deployed, teams will be motivated towards compartmentalizing the way they approach deployment. This is much better as the contextual footprint for each mini-Ops team is manageable. But there is still a problem here. We risk duplicating effort across teams on “core” deployment activities.
  • 19. CORE DEPLOYMENT TASKS • Log rotation • User account creation, sudoers, SSH keys • Continuous Integration • Metrics • DNS • Monitoring & alerting • ulimits • Tool installation • … All of these are important. Doing them well requires that they be owned and continuously improved and maintained as a first-class system asset. On feature teams they will not be treated in this way; instead we’ll duplicate effort building several half-formed implementations of these things.
  • 20. μ Feature Team Feature Team Feature Team μ μ μ μ μ μ μ μ μ μ μ Build Build Build Ops Team “Core” Deployment Pipeline We still needed an Ops Team to own the core parts of deployment, but needed a way to ensure that the interface between the feature and Ops teams has low information friction and does not require the Ops team to understand n different tech stacks. We could have tried to use Chef to do this by making each team own its own roles, but you end up running into problems around shared dependencies, global state and indirect coupling. Docker provides a neat abstraction which allows these responsibilities to be separated clearly and scalably.
  • 21. HOW DOES DOCKER ALLOW US TO SCALE ENGINEERING? 1. Development team has complete control over what they deploy. This allows us to scale teams out linearly without creating a centralized bottleneck. 2. Core deployment can still be owned by a dedicated team as a first-class concern. This eliminates wasted duplicate effort and the effort of maintaining a substandard system. 3. Small shared context needed for cross-team communication. This eliminates wasted effort communicating complex requirements in an out-of-band way.
  • 23. Test Build Deploy We’re going to run through the test-build-deploy pipeline at GameChanger. We’re using a separate service for each of those, so let’s introduce the cast of characters.
  • 24. Drone, The Test Runner: Docker-based testing. Fully isolated and concurrent tests. Easily tests against service containers, e.g. Postgres.
  • 25. Jenkins, The Build Server: In charge of building Docker containers, app packages, etc. Can do almost anything, but isolation is not a strong suit.
  • 26. Bagel, The Deploy Service: Manages versioning of deployable apps across environments. One-click deploy, rollback. mmm, bagel
  • 28. Test Build Deploy We’re going to go through what it takes to wire up an application to work with our pipeline…
  • 29. Test Build Deploy …and while we’re doing it, we’re going to highlight the ways in which Docker helps us achieve the goals that Tom was talking about earlier.
  • 30. Python + Postgres: A Simple Application Let’s consider a simple Python application that works with a Postgres database. It has a bunch of unit tests, including unit tests that require connecting to an actual Postgres instance to run.
  • 31. Test: Removes some dependency setup concerns from application dev. Tests are fully isolated, coupling is minimized. Fast and parallelizable; multiple teams can work on a single app without slowing each other down. Drone, The Test Runner: Drone, as we mentioned before, is what we use to run tests. What are the benefits we get from using Drone?
  • 32. Drone uses a simple YAML specification Test
  • 33. Test Drone uses a simple YAML specification
  • 34. Application devs do not need to know how to install these Use official images or write a generalized image once, share with all your teams Test
  • 35. Server Host OS Each test run is fully isolated using containers Test
  • 36. Server Host OS Each test run is fully isolated using containers Parallel testing becomes trivial Fast testing of PRs reduces likelihood of breaking the build Test
  • 37. Build Receives Git hash and static dependencies from Drone Builds Docker images, pushes to our private Docker registry Jenkins The Build Server
  • 39. Tested application code (specific Git hash) Same library versions used to test that code, e.g. pip freeze, npm shrinkwrap All system libraries, drivers, etc. Build What exactly are we putting into our images? Note that our image will *not* contain service dependencies like the Postgres we want to run against. We have a few options for how to connect to a database at runtime.
  • 40. Drone PyPI Jenkins Registry Image Build So this is how we make the build work with Docker. Why is this actually better than the traditional model? Well…
  • 41. Drone PyPI Jenkins Registry Image Build X …what if we put a bullet in PyPI and are no longer able to get our library dependencies? Before we answer that, what happened in the old world? We’d deploy new code, our servers would all try to pull from PyPI, fail, and freak out. Is our site down? Are we up but with incorrect or partial dependencies? It’s not great. What happens with Docker?
  • 42. Drone PyPI Jenkins Registry Image Build X With Docker, that risk is moved from deploy time to build time. Jenkins will try to pull from PyPI and fail. It won’t be able to push a new image with your updated code. This is usually a good thing! You can have confidence that all the images you *do* have will have all their dependencies and be fully working images.
  • 43. Mostly a thin API on top of our Docker registry Also owns triggering deploys across our infrastructure Deploy Bagel The Deploy Service Bagel gives us a way to coordinate the images in our Docker registry with their corresponding Git tags, the dependencies that were baked into the images, etc. I think just go into a quick demo here, show some cool dependency diffs and PR messages or something.
  • 44.
  • 45. Deploy deploy.travisthieman.com Deploys triggered via gossip protocol Coordinated with distributed locks What happens when we hit the Deploy button in Bagel?
  • 46. All our machines run identical OS-level images. Images and runtime config specified via YAML. Deploy mechanism on each machine reconciles spec and containers currently running. Similar to Docker Compose. All our boxes run off the same machine image (AMI). A YAML file specifying which of our apps should be deployed to that box is all that distinguishes it. Our pretty-dumb deploy scripts (triggered by Bagel) handle matching the running state of that box’s containers to what’s in this YAML file and the current deployed versions according to Bagel.
  • 47. Test Build Deploy That’s our deploy pipeline. Using Docker, we’ve seen significant gains in simplicity and developer productivity across our test, build, and deploy stages. Our feature teams can release new services with ease, and our Ops team has been phased out of existence. Our engineers are now free to focus on problems that benefit our customers and our business. Before I go, just a few closing thoughts on Docker as someone who’s spent a bit of time with it…
  • 48. Young Ecosystem. Roll Your Own? Build Process Intricacies. Docker on OS X. Development Environments.
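
The move of dependency risk from deploy time to build time (slides 41 and 42) can be illustrated with a small Python sketch. Everything in it is hypothetical: `fetch_dependencies`, `build_and_push`, the in-memory `registry` dict, the image tags and the pinned versions are stand-ins, not GameChanger's Jenkins job or any real registry API. The only point is that a failed fetch aborts the build, so the registry never receives a partial image.

```python
class DependencyFetchError(Exception):
    """Raised when a package index (e.g. PyPI) cannot be reached."""


def fetch_dependencies(pinned, index_available):
    """Hypothetical stand-in for installing pinned library versions."""
    if not index_available:
        raise DependencyFetchError("package index unreachable")
    return dict(pinned)


def build_and_push(registry, app, git_hash, pinned, index_available=True):
    """Bake code and dependencies into one image; push only if complete."""
    deps = fetch_dependencies(pinned, index_available)  # may raise
    image = {"app": app, "git_hash": git_hash, "deps": deps}
    tag = f"{app}:{git_hash}"
    registry[tag] = image  # reached only when every dependency resolved
    return tag


registry = {}  # assumed model of a registry: tag -> image contents
pinned = {"psycopg2": "2.6", "flask": "0.10.1"}  # made-up example pins
build_and_push(registry, "scores", "abc123", pinned)

try:
    # Simulate the package index being down during the next build.
    build_and_push(registry, "scores", "def456", pinned, index_available=False)
except DependencyFetchError:
    pass  # the build fails, but nothing was pushed

print(sorted(registry))
```

In this sketch the earlier image stays in the registry untouched, so servers that scale up during the outage can still pull a complete, previously built image.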
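
Slide 10 claims that rollback is trivial, and slide 26 says Bagel manages versioning across environments. One way to picture why immutable images make this cheap is the following Python sketch. It is an assumption-laden illustration, not Bagel's actual design: `DeployHistory`, its methods and the tags are invented names, and the talk does not describe how Bagel stores versions.

```python
class DeployHistory:
    """Hypothetical record of which image tag is live in each environment."""

    def __init__(self):
        self._history = {}  # environment name -> list of deployed tags

    def deploy(self, env, tag):
        """Record `tag` as the live version for `env`."""
        self._history.setdefault(env, []).append(tag)

    def current(self, env):
        """Return the tag most recently deployed to `env`."""
        return self._history[env][-1]

    def rollback(self, env):
        """Point `env` back at the previously deployed tag."""
        versions = self._history[env]
        if len(versions) < 2:
            raise ValueError(f"no earlier version recorded for {env}")
        versions.pop()
        return versions[-1]


history = DeployHistory()
history.deploy("production", "scores:abc123")
history.deploy("production", "scores:def456")
print(history.rollback("production"))
```

Because the old image still exists unchanged in the registry, rolling back in this model only moves a pointer; nothing has to be rebuilt or reinstalled.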
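
The reconciliation step described on slide 46 can be sketched in Python as a pure function. This is a minimal sketch under assumptions: the spec and the running state are modeled as plain dicts mapping an app name to an image tag, and `reconcile` is a hypothetical name. The talk does not show the real YAML schema or the deploy scripts.

```python
def reconcile(desired, running):
    """Return the container actions needed to make `running` match `desired`.

    Both arguments map an app name to an image tag (assumed shape).
    """
    actions = []
    # Stop anything that is running but no longer listed in the spec.
    for app in sorted(running.keys() - desired.keys()):
        actions.append(("stop", app, running[app]))
    for app in sorted(desired):
        want = desired[app]
        have = running.get(app)
        if have is None:
            actions.append(("start", app, want))
        elif have != want:
            # Version changed: replace the old container with the new image.
            actions.append(("stop", app, have))
            actions.append(("start", app, want))
    return actions


desired = {"api": "api:v42", "worker": "worker:v7"}
running = {"api": "api:v41", "legacy": "legacy:v3"}
for action in reconcile(desired, running):
    print(action)
```

In this sketch, calling `reconcile` when the box already matches the spec returns an empty list, so re-triggering a deploy does no extra work.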
