Designing for the Cloud: A Tutorial. Stuart Charlton, CTO, Elastra
Tutorial Objectives What has cloud computing done to IT systems design & architecture? “The future is already here, it’s just not very evenly distributed” (Gibson) How should new systems be designed with the new constraints? Such as: parallelism, availability, on-demand infrastructure Where can I find practical frameworks, tools, and techniques, and what are the tradeoffs? Hadoop, Cassandra, Parallel DBs, Actors, Caches, Containers, and Configuration Management
About Your Presenter Stuart Charlton, CTO, Elastra In prior lives... RESTafarian and Data geek Stu Says Stuff: http://stucharlton.com/blog
Tutorial Agenda, in 4 Words: Clouds, Service, Data, Control
Agenda – Part 1 Clouds: Fear of a Fluffy Planet What has changed, and what remains the same? Designing applications in this world A Cloud Design Reference Architecture (aka.  A cheat sheet to categorize thinking in the clouds) Service:  Foundations for Systems Solving Big Problems vs. Little Problems Amdahl’s Law & The Universal Scalability Law  Actor-Based Concurrency:  Dr. Strangelanguage, (or How I Learned to Stop Worrying and Love Erlang)
Agenda – Part 2 Data: Management & Access Contrasting Philosophies Persistence vs. Management; Scale-Up vs. Scale-Out Shared Disk vs. Shared Nothing A survey of solutions (from clustered DBMS to K/V stores) Consistency, Availability, Partitioning (CAP) Tradeoffs Deep dig into what these really imply Control: Containers, Configuration & Modeling The Dev/Ops Tennis Match The Evolution of Automation From Scripts to Runbooks to FSMs to HTNs
Caveats Audience Assumption:  IT Devs & Architects Some exposure to cloud, but not necessarily advanced The technology is a fast moving target Especially state of the specific tools & frameworks Theory vs. practice I try to balance the two; both are essential Time is limited Only scratching the surface of certain topics Missing topics are usually full tutorials in their own right Much of the subject matter is up for debate And, this is a tutorial, not a workshop…. 
Clouds: Fear of a Fluffy Planet
(Courtesy of browsertoolkit.com)
The Freedom! On Demand Infrastructure via API calls Inside or outside my data centres (Private / Public Cloud) Pay-per-use pricing models Great for temporary growth needs Platform-as-a-Service Scalability without Skill, Availability without Avarice Large Scale, Always On New opportunities due to cheaper scale & availability
The Horror! Hype Overdrive Cloud Running Shoes! Cloud Chewing Gum! GOOG! Werner Vogels Action Figures! (well, not quite yet) Standards Support So many to choose from! OCCI, vCloud + OVF, EC2, WBEM, WS-Management Platform-as-a-Service What color would you like for your locked trunk’s interior? Crazy Talk No SQL! Eventual Consistency! Infrastructure as Code!
Will the Real Slim Cloudy Please Stand Up? “I, for one, welcome our new  outsourced overlords” Finer-grained outsourcing Metered resource usage APIs & self-service UIs … but isn’t outsourcing often a shell game? See Distributed Computing Economics, Jim Gray (2003) “Scale without skill,   availability without avarice” Insert constrained code [here] Magically scalable & available GAE, Azure (some day) … but aren’t you locked in?
Will the Real Slim Cloudy Please Stand Up? “I like Big *aaS and I cannot lie” “My name is… what? Slim Cloudy!” Private, Public, or Community Clouds Multiple stack levels “Real” SOA, not just web services … haven’t I heard this before? Reduced lead times to change Agile Operations / Lean IT Revolution in systems management … can we really change IT?
Designing Applications in this World Distributed & networked systems have triumphed The fallacies must be taken seriously now Network is unreliable, latency > 0, bandwidth is finite, topology might change, etc. Scale-out & fault tolerance: the new design center Versus productive business logic, data management, etc. What’s old is new Some challengers to mainstream ideas are old ideas being reapplied e.g. Erlang, Map/Reduce, distributed file systems, replication
Designing Applications in this World Autonomous services constitute most systems Full-stack services, not just bits of code Design for constant operations Interdependence + Distribution + Autonomy = Pain FCAPS (Fault, Configuration, Accounting, Performance & Security Management)  Security & Privacy Multi-tenancy, data-in-transit vs. data-at-rest, etc.
Solving for one’s own problems Mainstream tools, platforms, and servers have not consistently caught up LOTS of software experimentation in: Web servers, containers, caches, databases, network configuration, systems management The danger is to view new solutions as the better way of doing things in general It’s possible; but stuff is changing quickly New territory always involves a level of reinvention The tech world has not rebooted due to cloud computing Beware Fanbois/Fangrrls, Pundits & The Press
A Cloud Design Reference Architecture Web: WebArch & REST Service, Data, & Control: this tutorial Resource: virtualization, management & infrastructure clouds Layers (diagram): Web, Service, Data, Control, Resource
Service Organizing your computing domain for fault, scale, management
Data Storage, retrieval, integrity, recovery given: Distributed systems; Large scale; High availability; (possible) Multi-tenancy
Control Provision, configuration, governance, and optimization of infrastructure Resource brokerage Policy constraints Dependency management Software configuration Authorization & Auditability
Service Foundation for Systems
Designing a Service, circa 1998-2008 Multi-Tier Hybrid Architecture Some stateless, some stateful computing Session state is replicated Independent servers / applications Low-level redundancy (RAID, 2x NICs, etc.) “Put your eggs into a small number of baskets, and watch those baskets” General assumptions Failure at the service layer shouldn’t lead to downtime Failure at the data layer may be catastrophic
Designing a Service, circa 2008+ Autonomous services Divide system into areas of functional responsibility (tiers irrelevant) Interdependent servers / applications Software-level redundancy and fault handling “Many, many servers breaking big problems down or distributing lots of little problems around” New realities Partial failure is a regular, normal occurrence; no excuse for downtime from any service
Breaking or bridging a problem across resources
Big Problems (Parallel) Theory: Amdahl’s law; shared memory or disk vs. shared nothing New Practice: MapReduce (e.g. Hadoop), Spaces, Master/Worker Retro: Linda, MPI, OpenMP, IPC or Threads
Little Problems (Concurrent) Theory: Actor-model & process calculi New Practice: Lightweight Messaging, Spaces, Erlang & Scala Actors Retro: IPC, Thread pools, Components (COM+/EJB), Big Messaging (MQ, TIB, JMS)
Case Study in “Big Problem” Solving: MapReduce & Apache Hadoop Input: Read your data from files as a K/V map Distribute Mapping Function: Input one (k,v) pair, return a new K/V list Partition & Sort: Handled by framework (e.g. Hadoop); provide a comparator Distribute Reduce Function: Input one (k, list of values) pair; return a list of output values Output: Save the list as a file
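The phases on this slide can be sketched in a few lines of single-process Python. This is only an illustration of the data flow (map, partition and sort, reduce), not Hadoop's API: the names `map_fn`, `reduce_fn`, and `run_mapreduce`, and the word-count job itself, are assumptions made up for the example.

```python
from collections import defaultdict
from itertools import chain


def map_fn(key, value):
    """Map: one (k, v) pair in, a list of new (k, v) pairs out."""
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    """Reduce: one (k, list of values) pair in, a list of output values out."""
    return [sum(values)]


def run_mapreduce(records, map_fn, reduce_fn):
    """Toy, in-process stand-in for the framework (hypothetical helper)."""
    # Map phase: apply map_fn to every input record.
    mapped = chain.from_iterable(map_fn(k, v) for k, v in records)
    # Partition & sort: group values by key, then visit keys in sorted order.
    groups = defaultdict(list)
    for k, v in mapped:
        groups[k].append(v)
    # Reduce phase: one call per key with all of that key's values.
    return {k: reduce_fn(k, groups[k]) for k in sorted(groups)}


# Input: line number -> line of text (a K/V map, as on the slide).
lines = {0: "the cloud is the network", 1: "the network is the computer"}
counts = run_mapreduce(lines.items(), map_fn, reduce_fn)
```

In a real framework the map and reduce calls run on many machines and the grouping step moves data over the network; here everything happens in one dictionary, which is what makes the sketch short.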
… But how fast can I get? Theory Interlude: Amdahl’s Law How fast can I speed up a sequential process? Time = Serial part + Parallel part. Thus, the speed up is $S(N) = \frac{1}{(1 - P) + P/N}$, where P is the fraction of the program that can be parallel and N is the number of processors. What happens when P is 95%? -- Maximum of 20x. How about 99.99%?
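A small calculation makes the ceiling concrete. The sketch below uses the standard form of Amdahl's Law, speedup = 1 / ((1 - P) + P / N); the function names are assumptions chosen for the example.

```python
def amdahl_speedup(p, n):
    """Speedup on n processors when a fraction p of the work is parallel."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


def amdahl_limit(p):
    """Upper bound on speedup as n grows without limit: 1 / (1 - p)."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


limit_95 = amdahl_limit(0.95)           # about 20x
limit_9999 = amdahl_limit(0.9999)       # about 10,000x
at_100 = amdahl_speedup(0.95, 100)      # well short of the 20x ceiling
```

With P at 95%, even 100 processors deliver noticeably less than the 20x ceiling, because the 5% serial part dominates once the parallel part has been divided up.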
Gunther’s Universal Scalability Law It gets worse… Most scale-out experiences retrograde behavior at peak loads $\text{Capacity}(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$ α is the contention; β is the coherency delay http://www.perfdynamics.com/Manifesto/gcaprules.html
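The retrograde behavior can be seen by evaluating the formula directly. In the sketch below the α and β values are invented for illustration, not measurements; the peak location follows from setting the derivative of Capacity(N) to zero, which gives N* = sqrt((1 - α) / β).

```python
import math


def usl_capacity(n, alpha, beta):
    """Relative capacity at n nodes under the Universal Scalability Law."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak(alpha, beta):
    """Node count at which capacity peaks (requires beta > 0)."""
    return math.sqrt((1.0 - alpha) / beta)


# Hypothetical coefficients: 5% contention, 0.1% coherency delay.
ALPHA, BETA = 0.05, 0.001

peak_n = usl_peak(ALPHA, BETA)
near_peak = usl_capacity(31, ALPHA, BETA)
past_peak = usl_capacity(62, ALPHA, BETA)   # doubling nodes lowers capacity
```

Past the peak, adding nodes reduces total capacity, which is the difference from Amdahl's Law, where the curve only flattens.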
Case study in solving “little problems”: Actors: The Basic Idea Programmable entities are concurrent, share nothing, communicate through messages Actors can: Send messages; Create other actors; Specify how they respond to messages Very lightweight (actors = objects) Usually no ordering guarantees At the language level
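The idea can be approximated in plain Python with one thread and one mailbox per actor. This is a rough sketch under stated assumptions, not how Erlang or Scala implement actors: threads are far heavier than language-level actors, and the `Actor` and `Counter` classes are hypothetical names for the example. A single sender into a FIFO queue also sees ordered delivery here, which real actor systems usually do not promise in general.

```python
import queue
import threading

_STOP = object()  # private sentinel message that shuts an actor down


class Actor:
    """Owns its state; the outside world interacts only by sending messages."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        self._mailbox.put(message)

    def stop(self):
        self._mailbox.put(_STOP)
        self._thread.join()

    def _run(self):
        # Process one message at a time, so receive() never runs concurrently.
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class Counter(Actor):
    def __init__(self):
        self.total = 0          # set state before the mailbox thread starts
        super().__init__()

    def receive(self, message):
        kind, payload = message
        if kind == "add":
            self.total += payload
        elif kind == "get":
            payload.put(self.total)   # reply by message, not by shared access


counter = Counter()
for i in range(1, 5):
    counter.send(("add", i))
reply = queue.Queue()
counter.send(("get", reply))
result = reply.get(timeout=1)
counter.stop()
```

No lock protects `total`: only the actor's own thread touches it while the actor is running, which is the share-nothing property the slide describes.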
Erlang Supervisors: Assuming failure will occur Failures require cleanup & restart Supervisor relationships can ensure the system tolerates faults Hot-swap patches Fundamentally in the language libraries
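The restart half of this idea can be sketched in Python. This is an analogy only, not Erlang/OTP: the `supervise` function and the `flaky_worker` child are hypothetical, and a real supervisor watches separate processes rather than calling a function in a loop. The restart cap plays a role loosely similar to a supervisor's limit on how many restarts it tolerates before giving up.

```python
def supervise(child, max_restarts=3):
    """Run child(); on failure, restart it up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return child()
        except Exception as exc:
            # The child does no recovery of its own; it just crashes.
            if restarts >= max_restarts:
                raise RuntimeError("restart limit reached") from exc
            restarts += 1


attempts = {"n": 0}


def flaky_worker():
    """Hypothetical child that fails twice with a transient fault."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient fault")
    return "done"


result = supervise(flaky_worker)
```

The worker contains no error handling at all; the fault-tolerance policy lives entirely in the supervisor, which is the separation the slide points at.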
What kinds of failures? A Simplification.
Exceptional Conditions: Conditions that a programmer did not or should not handle Tolerated through replication, fast failure, and/or restart(s) Examples: Hardware failures, network outages, “Heisenbugs”, rare software conditions
Error Conditions: Conditions that the programmer can handle Handled through cleanup or “catch” code Examples: File not found, type conversion, bad arithmetic (divide by zero), malformed input
Data Management & Access
Evolving the Database: Two Philosophies
Data Persistence Systems and Frameworks: Goal: Store & retrieve data quickly, reliably, with minimal hassle to the programmer Often uses application tools & languages to manage & access data Focused set of features
Database Management Systems (DBMS): Goal: Manage the access, integrity, security, and reliability of data, independently of applications Hard separation of tools & languages (e.g. SQL, DBA tools) Broad set of features
Scaling the Database: Two Philosophies
Scale-Up: Concurrent processing & parallelism through hardware SMP, NUMA, MPP RAID Arrays (SAN & NAS) Shared disk or memory Benefit: It worked in the 90s. Drawback: Expensive, often bespoke, forklift upgrades
Scale-Out: Concurrent processing & parallelism through software Commodity hardware Software provides the engine Shared nothing Benefit: Linear scale, easy to standardize, easy to replicate / upgrade Drawback: Traditionally, the software sucked.
… What happens when database clustering software stops sucking? (i.e. now) A flurry of programmer-oriented approaches Persistence engines rule the bleeding edge in 2009 Key/Value Stores, JSON Document stores, etc. Declarative/Imperative impedance mismatch (the “Vietnam” of the software tools industry) gets conflated with distributed data Lots of practical confusion
Too many choices, with idiosyncratic design histories
Let’s detangle this…
When should I share components?
Shared Disk: Partition compute across nodes Storage is shared through NAS or SAN Good for: Mixed workload; Small random access reads Worst case: Inter-node network chatter caps scalability; Disk pings to propagate writes (e.g. Oracle pre-RAC)
Shared Nothing: Partition data across nodes Each node owns its data Good for: Read-mostly; Parallel reads of huge data volumes; Consistent writes go to one partition Worst case: Repartitioning; Hotspot records don’t scale; Writes that span partitions
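The shared-nothing "worst case: repartitioning" can be illustrated with a naive hash-modulo placement scheme. This sketch is an assumption made for the example, not how any product named in this tutorial places data; the `owner` function and the `user:` key format are hypothetical.

```python
import hashlib


def owner(key, num_nodes):
    """Pick the node that owns a key under naive hash-modulo partitioning."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


keys = [f"user:{i}" for i in range(10_000)]

# Placement on a 4-node cluster, then after growing to 5 nodes.
before = {k: owner(k, 4) for k in keys}
after = {k: owner(k, 5) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
moved_fraction = moved / len(keys)   # roughly four keys in five change owner
```

A key keeps its owner only when its hash gives the same remainder modulo 4 and modulo 5, so adding a single node relocates most of the data. Schemes such as consistent hashing exist largely to shrink that fraction.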
Modern Data Persistence Systems  Object Persistence “Navigational databases in Java, Smalltalk, C++” GemStone, Versant, Objectivity Distributed Key-Value Stores “Structured data with lesser need for complex queries” Consistent:   BigTable, HBase, Voldemort Eventually Consistent:  Dynamo, Cassandra Document and/or Blob Stores “Indexed structured data + binaries/fulltext” CouchDB, BerkeleyDB, MongoDB
Clustered DBMS for Transactions Oracle Real Application Clusters (RAC) Shared disk, Replicated Memory (“Cache Fusion”) Limited by mesh interconnect to disk (partitioning possible) IBM DB2 Data Partitioning Feature Shared nothing database cluster, high number of nodes IBM DB2 pureScale New (Oct 2009) technology that ports IBM DB2 mainframe shared-disk clustering to the DB2 for open systems Microsoft SQL Server 2008 “Federated” Shared Nothing Database a longtime feature
Clustered DBMS for Parallel Queries Teradata The old standard data warehouse, hardware + software Netezza Data warehousing appliance (hw + software) Vertica Column-oriented, shared nothing clustered database Mike Stonebraker’s new company Greenplum Column-oriented, shared nothing clustered database Based on PostgreSQL with MapReduce engine
Scaling to Internet-Scale
Single Control Domain: One Database Site Consistency is built-in Scalable with tradeoffs among different workloads Scale to the limits of network bandwidth & manageability Main Example: Clustered DBMS
Multiple Control Domains: Many Database Sites Consistency requires agreement protocol Scalable only if consistency is relaxed Nearly limitless (global) scale Main Examples: DNS, The Web
How do I make consistency tradeoffs? Theory interlude: The CAP theorem Consistency (A+C in ACID): There’s a total ordering on all operations on the data; i.e. like a sequence Availability: Every request on non-failed servers must have a response Tolerance to Network Partitions: All messages might be lost between server nodes Choose at most two of these (as a spectrum).
CAP Tradeoffs: Consistency & Availability
 Fault tolerance through replicas & fast fail + fast recovery
 network outage between servers might halt the system
 generally requires a single domain of control
Examples that emphasize C+A:
 Single-site cluster databases
 Google BigTable
 Hadoop’s HBase
 Oracle RAC, IBM DB2 Parallel
 Clustered file systems
 Google File System & HDFS
 Distributed Spaces & Caches
 Coherence, Gigaspaces & Terracotta
Implication:

 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Dernier (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

Designing for the Cloud: A Practical Guide

  • 1. Designing for the Cloud: A Tutorial. Stuart Charlton, CTO, Elastra
  • 2. Tutorial Objectives What has cloud computing done to IT systems design & architecture? “The future is already here, it’s just not very evenly distributed” (Gibson) How should new systems be designed with the new constraints? Such as: parallelism, availability, on demand infra Where can I find practical frameworks, tools, and techniques, and what are the tradeoffs? Hadoop, Cassandra, Parallel DBs, Actors, Caches, Containers, and Configuration Management
  • 3.
  • 4. Tutorial Agenda, in 4 Words: Clouds, Service, Data, Control
  • 5. Agenda – Part 1 Clouds: Fear of a Fluffy Planet What has changed, and what remains the same? Designing applications in this world A Cloud Design Reference Architecture (aka. A cheat sheet to categorize thinking in the clouds) Service: Foundations for Systems Solving Big Problems vs. Little Problems Amdahl’s Law & The Universal Scalability Law Actor-Based Concurrency: Dr. Strangelanguage, (or How I Learned to Stop Worrying and Love Erlang)
  • 6. Agenda – Part 2 Data: Management & Access Contrasting Philosophies Persistence vs. Management; Scale-Up vs. Scale-Out Shared Disk vs. Shared Nothing A survey of solutions (from clustered DBMS to K/V stores) Consistency, Availability, Partitioning (CAP) Tradeoffs Deep dig into what these really imply Control: Containers, Configuration & Modeling The Dev/Ops Tennis Match The Evolution of Automation From Scripts to Runbooks to FSMs to HTNs
  • 7. Caveats Audience Assumption: IT Devs & Architects Some exposure to cloud, but not necessarily advanced The technology is a fast moving target Especially state of the specific tools & frameworks Theory vs. practice I try to balance the two; both are essential Time is limited Only scratching the surface of certain topics Missing topics are usually full tutorials in their own right Much of the subject matter is up for debate And, this is a tutorial, not a workshop…. 
  • 8. Clouds: Fear of a Fluffy Planet
  • 9. (Courtesy of browsertoolkit.com)
  • 10. The Freedom! On Demand Infrastructure via API calls Inside or outside my data centres (Private / Public Cloud) Pay-per-use pricing models Great for temporary growth needs Platform-as-a-Service Scalability without Skill, Availability without Avarice Large Scale, Always On New opportunities due to cheaper scale & availability
  • 11. The Horror! Hype Overdrive Cloud Running Shoes! Cloud Chewing Gum! GOOG! Werner Vogels Action Figures! (well, not quite yet) Standards Support So many to choose from! OCCI, vCloud + OVF, EC2, WBEM, WS-Management Platform-as-a-Service What color would you like for your locked trunk’s interior? Crazy Talk No SQL! Eventual Consistency! Infrastructure as Code!
  • 12. Will the Real Slim Cloudy Please Stand Up? “I, for one, welcome our new outsourced overlords” Finer-grained outsourcing Metered resource usage APIs & self-service UIs … but isn’t outsourcing often a shell game? See Distributed Computing Economics, Jim Gray (2003) “Scale without skill, availability without avarice” Insert constrained code [here] Magically scalable & available GAE, Azure (some day) … but aren’t you locked in?
  • 13. Will the Real Slim Cloudy Please Stand Up? “I like Big *aaS and I cannot lie” “My name is… what? Slim Cloudy!” Private, Public, or Community Clouds Multiple stack levels “Real” SOA, not just web services … haven’t I heard this before? Reduced lead times to change Agile Operations / Lean IT Revolution in systems management … can we really change IT?
  • 14. Designing Applications in this World Distributed & networked systems have triumphed The fallacies must be taken seriously now Network is unreliable, latency > 0, bandwidth is finite, topology might change, etc. Scale-out & fault tolerance: the new design center Versus productive business logic, data management, etc. What’s old is new Some challengers to mainstream ideas are old ideas being reapplied e.g. Erlang, Map/Reduce, distributed file systems, replication
  • 15. Designing Applications in this World Autonomous services constitute most systems Full-stack services, not just bits of code Design for constant operations Interdependence + Distribution + Autonomy = Pain FCAPS (Fault, Configuration, Accounting, Performance & Security Management) Security & Privacy Multi-tenancy, data-in-transit vs. data-at-rest, etc.
  • 16. Solving for one’s own problems Mainstream tools, platforms, and servers have not consistently caught up LOTS of software experimentation in: Web servers, containers, caches, databases, network configuration, systems management The danger is to view new solutions as the better way of doing things in general It’s possible; but stuff is changing quickly New territory always involves a level of reinvention The tech world has not rebooted due to cloud computing Beware Fanbois/Fangrrls, Pundits & The Press
  • 17. A Cloud Design Reference Architecture Web – WebArch & REST Service, Data, & Control – this tutorial Resource – virtualization, management & infrastructure clouds WEB SERVICE DATA CONTROL RESOURCE
  • 18. Service Organizing your computing domain for fault scale management WEB SERVICE DATA CONTROL RESOURCE
  • 19. Data Storage, retrieval, integrity, recovery given Distributed systems Large scale High availability (possible) Multi-tenancy WEB SERVICE DATA CONTROL RESOURCE
  • 20. Control Provision, configuration, governance, and optimization of infrastructure Resource brokerage Policy constraints Dependency management Software configuration Authorization & Auditability WEB SERVICE DATA CONTROL RESOURCE
  • 22. Designing a Service, circa 1998-2008 Multi-Tier Hybrid Architecture Some stateless, some stateful computing Session state is replicated Independent servers / applications Low-level redundancy (RAID, 2x NICs, etc.) “Put your eggs into a small number of baskets, and watch those baskets” General assumptions Failure at the service layer shouldn’t lead to downtime Failure at the data layer may be catastrophic
  • 23. Designing a Service, circa 2008+ Autonomous services Divide system into areas of functional responsibility (tiers irrelevant) Interdependent servers / applications Software-level redundancy and fault handling “Many, many servers breaking big problems down or distributing lots of little problems around” New realities Partial failure is a regular, normal occurrence; no excuse for downtime from any service
  • 24. Breaking or bridging a problem across resources Big Problems (Parallel) Theory: Amdahl’s law; Shared memory or disk vs. Shared nothing New Practice: MapReduce (e.g. Hadoop), Spaces, Master/Worker Retro: Linda, MPI, OpenMP, IPC or Threads Little Problems (Concurrent) Theory: Actor-model & process calculi New Practice: Lightweight Messaging, Spaces, Erlang & Scala Actors Retro: IPC, Thread pools, Components (COM+/EJB), Big Messaging (MQ, TIB, JMS)
  • 25. Case Study in “Big Problem” Solving: MapReduce & Apache Hadoop. Input: read your data from files as a K/V map. Distribute Mapping Function: input one (k, v) pair, return a new K/V list. Partition & Sort: handled by the framework (e.g. Hadoop); provide a comparator. Distribute Reduce Function: input one (k, list of values) pair, return a list of output values. Output: save the list as a file.
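The five steps on this slide can be sketched in a few lines of plain Python. This is a single-process illustration only, not Hadoop's API: the names `map_fn`, `reduce_fn`, and `run_mapreduce` and the sample records are hypothetical, and sorting the keys stands in for the comparator the framework would ask for.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: takes one (k, v) pair and returns a new list of (k, v) pairs."""
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    """Reduce: takes one (k, list of values) pair and returns a list of outputs."""
    return [sum(values)]


def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process stand-in for the framework (hypothetical helper)."""
    # Map phase: apply the mapping function to every input pair.
    intermediate = []
    for key, value in records.items():
        intermediate.extend(map_fn(key, value))
    # Partition & sort: group values by key; a real framework does this
    # across machines, using the comparator you provide.
    groups = defaultdict(list)
    for k, v in intermediate:
        groups[k].append(v)
    # Reduce phase: one call per key, visited in sorted key order.
    return {k: reduce_fn(k, groups[k]) for k in sorted(groups)}


# Input: file name -> file contents, read as a K/V map (made-up data).
records = {"a.txt": "the cloud is the future", "b.txt": "the future is here"}
counts = run_mapreduce(records, map_fn, reduce_fn)
print(counts)
```

In a real cluster the map and reduce calls run on many nodes at once; the point of the model is that the programmer only writes the two functions.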
  • 26. …But how fast can I get? Theory Interlude: Amdahl’s Law How fast can I speed up a sequential process? Time = Serial part + Parallel part Thus, the speedup is $S(N) = \frac{1}{(1 - P) + P/N}$, where P is the fraction of the program that can be parallel and N is the number of processors. What happens when P is 95%? Maximum of 20x. How about 99.99%?
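To make the two questions on this slide concrete, here is a minimal sketch of Amdahl's Law in its usual form, speedup = 1 / ((1 − P) + P / N). The function names are illustrative assumptions, not from the deck.

```python
def amdahl_speedup(p, n):
    """Speedup on n processors when a fraction p of the work can run in parallel."""
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return 1.0 / ((1.0 - p) + p / n)


def amdahl_limit(p):
    """Upper bound on speedup as n grows without limit (requires p < 1)."""
    return 1.0 / (1.0 - p)


for p in (0.95, 0.9999):
    print(f"P={p}: limit {amdahl_limit(p):.0f}x, "
          f"with 100 processors {amdahl_speedup(p, 100):.1f}x")
```

With P = 95% the serial 5% caps the speedup at 1 / 0.05 = 20x no matter how many processors are added; with P = 99.99% the cap is 1 / 0.0001 = 10,000x.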
  • 27. Gunther’s Universal Scalability Law It gets worse… Most scale-out experiences retrograde behavior at peak loads $\mathrm{Capacity}(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}$, where α is the contention and β is the coherency delay http://www.perfdynamics.com/Manifesto/gcaprules.html
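The retrograde behavior is easy to see numerically. The sketch below evaluates Capacity(N) = N / (1 + α(N − 1) + βN(N − 1)); the values α = 0.05 and β = 0.001 are made up for illustration and are not measurements from any system.

```python
def usl_capacity(n, alpha, beta):
    """Relative capacity at n nodes; alpha is contention, beta is coherency delay."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


ALPHA, BETA = 0.05, 0.001  # illustrative assumptions only

# Find the node count with the highest capacity, then compare with a larger cluster.
peak_n = max(range(1, 201), key=lambda n: usl_capacity(n, ALPHA, BETA))
print(f"peak at N={peak_n}: {usl_capacity(peak_n, ALPHA, BETA):.2f}")
print(f"at N=100: {usl_capacity(100, ALPHA, BETA):.2f}")
```

Setting the derivative of the formula to zero puts the peak near sqrt((1 − α) / β); past that point, adding nodes lowers total capacity. When β is 0 the formula has no such peak and capacity simply levels off, as in Amdahl's Law.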
  • 28. Case study in solving “little problems”: Actors: The Basic Idea Programmable entities are concurrent, share nothing, communicate through messages Actors can: send messages, create other actors, specify how they respond to messages Very lightweight (actors = objects) Usually no ordering guarantees At the language level
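As a rough illustration of the actor idea, the sketch below gives one actor a private mailbox and private state, and lets other code interact with it only by sending messages. It uses Python threads and `queue.Queue`, which are far heavier than Erlang processes; the `CounterActor` class and its message shapes are hypothetical.

```python
import queue
import threading


class CounterActor:
    """An actor that owns a counter; nothing outside touches its state directly."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # private state, only read or written by the actor's thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def join(self):
        self._thread.join()

    def _run(self):
        # The actor specifies how it responds to each kind of message.
        while True:
            message = self._mailbox.get()
            kind = message[0]
            if kind == "add":
                self._count += message[1]
            elif kind == "get":
                message[1].put(self._count)  # reply through the sender's own queue
            elif kind == "stop":
                break


actor = CounterActor()
for i in range(1, 6):
    actor.send(("add", i))
reply = queue.Queue()
actor.send(("get", reply))
total = reply.get(timeout=5)
actor.send(("stop",))
actor.join()
print(total)
```

Note that a single local queue with one sender happens to preserve order; the slide's warning about ordering applies once messages come from several senders or cross a network.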
  • 29. Erlang Supervisors: Assuming failure will occur Failures require cleanup & restart Supervisor relationships can ensure the system tolerates faults Hot-swap patches Fundamentally in the language libraries
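The restart idea behind supervision can be sketched in Python as follows. This is only an analogy: the `supervise` function and `FlakyWorker` class are hypothetical, and Erlang's supervisors are considerably richer (restart strategies, trees of supervisors) than this single retry loop.

```python
def supervise(worker, max_restarts=3):
    """Run worker(); on any exception, restart it up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up and escalate the failure to our own caller


class FlakyWorker:
    """Hypothetical worker that fails a fixed number of times before succeeding."""

    def __init__(self, failures_before_success):
        self.remaining = failures_before_success

    def __call__(self):
        if self.remaining > 0:
            self.remaining -= 1
            raise RuntimeError("transient fault")
        return "done"


result, restarts = supervise(FlakyWorker(2))
print(result, restarts)
```

The worker itself contains no recovery code; it fails fast and the supervisor decides whether to restart or escalate.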
  • 30. What kinds of failures? A Simplification. Exceptional Conditions: conditions that a programmer did not or should not handle. Tolerated through replication, fast failure, and/or restart(s). Examples: hardware failures, network outages, “Heisenbugs”, rare software conditions. Error Conditions: conditions that the programmer can handle. Handled through cleanup or “catch” code. Examples: file not found, type conversion, bad arithmetic (divide by zero), malformed input.
  • 32. Evolving the Database: Two Philosophies. Data Persistence Systems and Frameworks: Goal: store & retrieve data quickly, reliably, with minimal hassle to the programmer. Often uses application tools & languages to manage & access data. Focused set of features. Database Management Systems (DBMS): Goal: manage the access, integrity, security, and reliability of data, independently of applications. Hard separation of tools & languages (e.g. SQL, DBA tools). Broad set of features.
  • 33. Scaling the Database: Two Philosophies. Scale-Up: concurrent processing & parallelism through hardware. SMP, NUMA, MPP. RAID Arrays (SAN & NAS). Shared disk or memory. Benefit: It worked in the 90s. Drawback: Expensive, often bespoke, forklift upgrades. Scale-Out: concurrent processing & parallelism through software. Commodity hardware. Software provides the engine. Shared nothing. Benefit: Linear scale, easy to standardize, easy to replicate / upgrade. Drawback: Traditionally, the software sucked.
  • 34.
  • 35. Too many choices, with idiosyncratic design histories
  • 37. When should I share components? Shared Disk: partition compute across nodes; storage is shared through NAS or SAN. Good for: mixed workload; small random access reads. Worst case: inter-node network chatter caps scalability; disk pings to propagate writes (e.g. Oracle pre-RAC). Shared Nothing: partition data across nodes; each node owns its data. Good for: read-mostly; parallel reads of huge data volumes; consistent writes go to one partition. Worst case: repartitioning; hotspot records don’t scale; writes that span partitions.
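To illustrate why repartitioning is listed as a shared-nothing worst case, the sketch below assigns keys to nodes with a naive hash-modulo scheme and counts how many keys change owner when one node is added. The node names, key format, and `owner_of` helper are hypothetical; this is not how any particular database places data.

```python
import hashlib


def owner_of(key, nodes):
    """Pick the node that owns a key using hash(key) mod number-of-nodes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


keys = [f"customer:{i}" for i in range(1000)]
four = ["node-a", "node-b", "node-c", "node-d"]
five = four + ["node-e"]

# Each node owns its own partition of the data; growing the cluster from
# four to five nodes changes the owner of most keys under this scheme.
moved = sum(1 for k in keys if owner_of(k, four) != owner_of(k, five))
print(f"{moved} of {len(keys)} keys change owner when a fifth node is added")
```

With modulo placement, roughly four out of five keys move in this case, which means copying most of the data between nodes. Schemes such as consistent hashing are commonly used to reduce that movement.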
  • 38. Modern Data Persistence Systems Object Persistence “Navigational databases in Java, Smalltalk, C++” GemStone, Versant, Objectivity Distributed Key-Value Stores “Structured data with lesser need for complex queries” Consistent: BigTable, HBase, Voldemort Eventually Consistent: Dynamo, Cassandra Document and/or Blob Stores “Indexed structured data + binaries/fulltext” CouchDB, BerkeleyDB, MongoDB
  • 39. Clustered DBMS for Transactions Oracle Real Application Clusters (RAC) Shared disk, Replicated Memory (“Cache Fusion”) Limited by mesh interconnect to disk (partitioning possible) IBM DB2 Data Partitioning Feature Shared nothing database cluster, high number of nodes IBM DB2 pureScale New (Oct 2009) technology that ports IBM DB2 mainframe shared-disk clustering to the DB2 for open systems Microsoft SQL Server 2008 “Federated” Shared Nothing Database a longtime feature
  • 40. Clustered DBMS for Parallel Queries Teradata The old standard data warehouse, hardware + software Netezza Data warehousing appliance (hw + software) Vertica Column-oriented, shared nothing clustered database Mike Stonebraker’s new company Greenplum Column-oriented, shared nothing clustered database Based on PostgreSQL with MapReduce engine
  • 41. Scaling to Internet-Scale. Single Control Domain: one database site. Consistency is built-in. Scalable with tradeoffs among different workloads. Scale to the limits of network bandwidth & manageability. Main Example: Clustered DBMS. Multiple Control Domains: many database sites. Consistency requires agreement protocol. Scalable only if consistency is relaxed. Nearly limitless (global) scale. Main Examples: DNS, The Web.
  • 42. How do I make consistency tradeoffs? Theory interlude: The CAP theorem Consistency (A+C in ACID) There’s a total ordering on all operations on the data; i.e. like a sequence Availability Every request on non-failed servers must have a response Tolerance to Network Partitions All messages might be lost between server nodes Choose at most two of these (as a spectrum).
  • 43.
  • 44.
  • 45. network outage between servers might halt the system
  • 46. generally requires a single domain of control
  • 51. Oracle RAC, IBM DB2 Parallel
  • 52. Clustered file systems
  • 55.
  • 57. multiple domains of control
  • 58. clients can’t always read/write
  • 59. failures degrade scale & performance due to negotiation
  • 61. Distributed shared nothing databases
  • 63. Distributed locks & file systems
  • 64. Chubby & Hadoop’s ZooKeeper
  • 65. Paxos & consensus protocols
  • 66.
  • 68. multiple domains of control
  • 69. reads & writes always succeed (eventually)
  • 70. clients may read inconsistent (old or undone) data
  • 73. Web Caching & Content Delivery Networks
  • 74. Amazon Dynamo (and clones)
  • 75. Cassandra (Facebook, Digg)
  • 77.
  • 78. Please don’t throw out logical/relational data design! (unless you have to) “Future users of massive datasets should be protected from having to know how the data is organized in the computing cloud…. …. Activities of users through web agents and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed.” Paraphrasing Ed Codd – 39 years ago!
  • 80. The Dev / Ops Game
  • 81. Example: Why can’t these two servers communicate? Possible areas of problems Security Bad credentials Server Configuration Wrong IP or Port Bad setup to listen or call Network Configuration Wrong duplex Bad DNS or DHCP Firewall Configuration Ports or protocols not open
  • 82. Example: What do I need to do to make this change? Desired Change Scale-out this cluster But… Impacts on other systems Security Systems Load Balancers Monitoring CMDB / Service Desk Architecture issues Stateful or stateless nodes Repartitioning? Limits/constraints on scale out?
  • 83. Example: What is the authoritative reality? Desired State Configuration Template Model Script Workflow CMDB Code Current State On the server Might not be in a file Might get changed at runtime And when you do change… It may not actually change It might change to an undesirable setting It might affect other settings that you didn’t think about
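The gap between desired state and current state can be made visible with a simple comparison. The sketch below is a minimal illustration under the assumption that both states can be read as flat key/value settings; the setting names and the `config_drift` helper are hypothetical and do not come from any configuration tool.

```python
ABSENT = "<absent>"  # marker for a setting that exists on only one side


def config_drift(desired, current):
    """Return {setting: (desired_value, current_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(desired) | set(current)):
        want = desired.get(key, ABSENT)
        have = current.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Desired state, as a template or model would declare it (made-up values).
desired = {"listen_port": 8080, "max_connections": 500, "tls": "on"}
# Current state, as observed on the server (made-up values).
current = {"listen_port": 8080, "max_connections": 200, "debug": "on"}

drift = config_drift(desired, current)
for key, (want, have) in drift.items():
    print(f"{key}: desired={want} current={have}")
```

A report like this only covers settings that can be observed; as the slide notes, some current state is not in any file and may change at runtime.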
  • 84. Configuration Code, Files, and Models Bottom Up Scripts & Recipes Hand-grown automation Runbooks Workflow, policy Frameworks Chef, Puppet, Cfengine Build Dependency Systems Maven Top Down Modeled Viewpoints E.g. Microsoft Oslo, UML, Enterprise Architecture Modular Containers E.g. OSGi, Spring, Azure roles Configuration Models SML, CIM, ECML, EDML
  • 85. An Evolution of Automation Scripts For automating common cases Run-Book Automation Scripts as visual workflow Declarative Separate what you want from how you want it done Finite State Machines Organize scripts into described states & transitions Hierarchical Task Networks (Planning) Assemble a plan by exploring hypothetical strategic paths
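The finite state machine step in this evolution can be sketched as a table of described states and transitions, with each transition standing in for a script that would be run. The state names, event names, and `run` helper below are hypothetical.

```python
# (current state, event) -> next state; each entry stands in for one script.
TRANSITIONS = {
    ("absent", "provision"): "provisioned",
    ("provisioned", "configure"): "configured",
    ("configured", "start"): "running",
    ("running", "stop"): "configured",
}


def run(state, events):
    """Apply events in order, rejecting any transition the model does not describe."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot '{event}' a server in state '{state}'")
        state = TRANSITIONS[key]
    return state


final_state = run("absent", ["provision", "configure", "start"])
print(final_state)
```

Unlike a free-standing script, the table makes illegal sequences explicit: asking to start a server that was never configured is rejected instead of half-running. A planner in the hierarchical task network style would go one step further and search for the event sequence itself.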
  • 86. An Approach to Integrated Design and Ops
  • 87. Wrap-up Cloudy, with a chance of …
  • 88. Revisiting the Cloud Design Reference Architecture Service – Big vs. Little Problems; MapReduce & Actors; Amdahl’s Law Data – persistence vs. mgt; scale-up vs. scale-out; CAP tradeoffs Control – containers, configuration, automation WEB SERVICE DATA CONTROL RESOURCE
  • 89. For More Information Hadoop http://hadoop.apache.org/ CAP Theorem Proof Paper http://people.csail.mit.edu/sethg/pubs/BrewersConjecture-SigAct.pdf Google’s papers on Distributed & Parallel Computing http://research.google.com/pubs/DistributedSystemsandParallelComputing.html Neil Gunther’s “Taking the Pith out of Performance” Blog http://perfdynamics.blogspot.com/ A Comparison of Approaches to Large-Scale Data Analytics http://database.cs.brown.edu/sigmod09/benchmarks-sigmod09.pdf Model-Driven Operations for the Cloud http://www.stucharlton.com/stuff/oopsla09.pdf