Keynote for CSE conference 2011: Distributed Systems: What? Why? And bit of How?
I cannot cover Distributed Systems in 30 minutes!
But I can tell you why you might want to learn Distributed Systems in 30 minutes!
Photos: http://www.flickr.com/photos/uwehermann/82753155/sizes/m/in/photostream/ and http://www.flickr.com/photos/peterpearson/5921765552, licensed under CC
What is a Distributed System?
“A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed.”
--Leslie Lamport
What is a Distributed System?




“A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing.” - [Coulouris]
“A distributed system is a collection of independent computers that appears to the users of the system as a single coherent system.” - [Tanenbaum]
Characteristics and Challenges
• No Global Clock
• Communication only by message passing
• No Global State
• Independent Failures
• Fault Tolerance
• Scale
• Transparency
Photo by John Trainor on Flickr
                         http://www.flickr.com/photos/trainor/2902023575/, Licensed u
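One well-known way to order events without a global clock is a Lamport logical clock, from the "Time, Clocks, and the Ordering of Events in a Distributed System" paper listed later in this deck. The sketch below is only an illustration and is not part of the original slides; the class name `LamportClock` and its method names are hypothetical.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Jump past the sender's timestamp, then count the receive event.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                    # local event on a
stamp = a.send()            # a sends a message carrying its timestamp
b_time = b.receive(stamp)   # b's clock moves past the sender's timestamp
```

The receive always ends up with a larger timestamp than the matching send, so cause precedes effect in timestamp order even though the two machines never share a clock.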
Fallacies of Distributed Systems




•   The network is reliable.
•   Latency is zero.
•   Bandwidth is infinite.
•   The network is secure.
•   Topology doesn't change.
•   There is one administrator.
•   Transport cost is zero.
•   The network is homogeneous.
                      http://www.flickr.com/photos/12587661@N06/2300406685, @Michael Gwyther-Jones, L
Why Distributed Systems
•   Need to build bigger systems
•   Many use cases are inherently distributed
•   To avoid failures
•   Omnipresence
    –   If you buy food from a supermarket
    –   If you buy a book from a bookshop chain
    –   If you search the Web
    –   If you use a GPS navigator
    –   If you turn on your My 10 list
    –   If you pay a bill
    –   If you use your mobile app
A System Use Case Classification
• Processing Data (Moving vs. Stored Data)
• Servers: Receive, Process, and Respond
• Running User-Provided Jobs
• Data Storage and Provenance
                          http://www.flickr.com/photos/kelsea-groves/5535666329/
Use Case: Processing Data: React to Sensors
 • Many sensors: weather, travel, traffic, surveillance, stock exchange, smart grid, production line
 • Monitor, understand, and react to events
 • Usually handled with CEP (e.g. Esper, StreamBase, Siddhi) or Stream Processing (S4, Twitter Stream)
Photos: http://www.flickr.com/photos/imuttoo/4257813689/ by Ian Muttoo, http://www.flickr.com/photos/eastcapital/4554220770/, http://www.flickr.com/photos/patdavid/4619331472/ by Pat David, copyright CC
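The monitor-and-react loop on this slide can be sketched as a sliding-window rule over a stream of readings. This is plain Python for illustration only, not the API of Esper, StreamBase, Siddhi, or S4; the function name, window size, threshold, and sample data are all assumptions.

```python
from collections import deque


def detect_alerts(readings, window=3, threshold=30.0):
    """Yield (index, average) whenever the average of the last
    `window` readings exceeds `threshold`."""
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > threshold:
                yield i, avg


# Hypothetical temperature feed from one sensor.
temperatures = [25.0, 27.0, 29.0, 33.0, 35.0, 28.0]
alerts = list(detect_alerts(temperatures))
```

A CEP engine would typically express the same rule declaratively and evaluate it across many sensors at once; the window logic is the part this sketch tries to show.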
Use Case: Processing Data: Targeted Marketing
• Receive data about users continuously: e.g. web clicks, what they bought, what they liked and did not like, what their friends like and bought
• Build models and index information in the background
• Send each user the advertisements that best match their preferences
   – have to do this quickly
   – within a few (say 50) milliseconds
• Could be the next billion-dollar problem
Use Case: Receive, Process, and Respond: Online Store (e.g. Amazon)
• Many sellers selling many items, and many buyers
• List of all items, with their specs
• Index items by many dimensions and support search
• Support checkout, delivery tracking, returns, ratings, and complaints
• Supported by partitioning sellers/items across many nodes
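Partitioning sellers or items across nodes can be as simple as hashing the key and taking it modulo the node count. The sketch below assumes that scheme purely for illustration; the slides do not say how any real store partitions its data, and `node_for`, `NUM_NODES`, and the item ids are hypothetical.

```python
import hashlib


def node_for(key, num_nodes):
    """Map a key (e.g. a seller or item id) to a node index."""
    # Use a stable hash; Python's built-in hash() of str varies per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


NUM_NODES = 4  # hypothetical cluster size
items = ["seller-17/book-1", "seller-17/book-2", "seller-42/lamp-9"]
placement = {item: node_for(item, NUM_NODES) for item in items}
```

The weakness of plain modulo hashing is that changing the node count remaps most keys, which is one motivation for the DHT-style approaches mentioned later in the deck.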
Use Case: Running User-Provided Jobs: SETI@Home
• Many people volunteer their computing power
• Scientists submit computing jobs to the system
• Broker and match resources with jobs, run them, and return results. Handle failures. Avoid free riding.
• Considered the biggest computer on Earth (505 TFLOPS, 150k active computers)
            http://www.elfwood.com/~axthony/Staring-Aliens.2552052.html, Licensed CC
Use Case: Data Storage and Provenance (Sky Server)
• Telescopes (Square Kilometer Array) keep collecting data from the sky (terabytes per day)
• Sky Server lets scientists view the sky at a given location, as seen at a given time.
• Moving data takes a long time. Moving 1 TB takes roughly
   – 100 Mbps network : 30 hrs
   – 1 Gbps network : 3 hrs
   – 10 Gbps network : 20 minutes
• Given a data item, we need to track how it was created, the accuracy of the equipment, the transformations applied, etc.
Photos: http://www.fotopedia.com/items/flickr-518876976 and http://www.geograph.org.uk/photo/103069, Licensed CC
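The transfer times on this slide can be checked with a little arithmetic. The sketch below computes the ideal time at full line rate, assuming a decimal terabyte (10^12 bytes) and no protocol overhead; `transfer_hours` is a hypothetical helper name.

```python
def transfer_hours(size_bytes, link_bits_per_second):
    """Ideal transfer time in hours, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second / 3600


ONE_TB = 10**12  # decimal terabyte (assumption)
for name, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    print(f"{name}: {transfer_hours(ONE_TB, bps):.1f} h")
```

At ideal line rate this works out to about 22 hours, 2.2 hours, and 13 minutes. The slide's larger figures presumably leave room for overhead and less-than-full link utilization; that is an assumption, since the slides do not state how they were derived.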
Mobile Sensor Crowdsourcing
• A mobile phone is now like a weather center; it has
   – a barometer
   – a temperature sensor
   – a proximity sensor
   – GPS
   – a moisture sensor
• Get volunteer phones to send sensor data (crowdsourcing), and collect reports on
   – weather
   – crop diseases (from agriculture officials)
   – epidemics (from hospitals, doctors)
• Use that data to predict the weather and the spread of crop diseases and epidemics
• Moving Sensors (Polar Grid)
Photos: http://www.fotopedia.com/items/flickr-2548697541, http://www.geograph.org.uk/photo/1534209, and http://www.yourbdnews.com/2011/10/17/samsung-files-to-halt-iphone-4s-in-japan-australia/iphone-4s, Licensed CC
Great! Let's see what Distributed System technologies have made these use cases possible!
Distributed Systems Timeline/History
Period          Topics
1965-late 70s   Parallel Programming, Self Stabilization, Fault Tolerance, ER Model/Transactions, Time and Clocks
1980s           Consensus and impossibility, SQL, Distributed Snapshots, Replication, Group Communication
Early 90s       Linearizability, Parallel DB, Transactional Memory, RAID, MPI
Late 90s        Volunteer Computing, P2P file sharing, Complex Event Processing
Early 2000s     OceanStore, Web Services, Semantic Web, REST, DHT, Pub/Sub, Grid, Autonomic Computing, Google File System, Virtualization, SOA, MapReduce
2005-2010       Cloud, NoSQL, Mobile Apps, Data Provenance
Theoretical Computer Science
• Concerned with
   – Coordination algorithms: leader election, multicast, distributed locks, barriers, snapshot algorithms
   – Impossibility results, upper and lower bounds
   – Distributed versions of some centralized algorithms (e.g. shortest path)
   – A lot of this work was done in the 70s, and it laid the groundwork for Distributed Systems
Images: http://www.flickr.com/photos/lodz_na_nowo/5690492370/, http://xkcd.com/384/, http://www.flickr.com/photos/quinnanya/4990131194/sizes/z/in/photostream/, Licensed CC
Communication Protocols
• Request/Response
  – RMI, CORBA, REST/HTTP, WS, Thrift
• Publish/Subscribe
• Distributed Queues
• DHT (Distributed Hash Tables)
• Gossip/Epidemic Protocols
• Whiteboards
Photo: http://www.flickr.com/photos/novecentino/2596898279/, Licensed CC
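To give a feel for gossip/epidemic protocols, the sketch below simulates a rumor spreading from one node: in each round, every informed node tells a few randomly chosen peers. It is a toy simulation under assumed parameters (the `fanout`, the fixed `seed`, and uniform random peer selection), not any particular protocol from the literature.

```python
import random


def gossip_rounds(num_nodes, fanout=2, seed=0):
    """Count rounds until a rumor started at node 0 reaches every node."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    informed = {0}
    rounds = 0
    while len(informed) < num_nodes:
        rounds += 1
        # Iterate over a copy so newly informed nodes start next round.
        for _ in list(informed):
            # Each informed node tells `fanout` randomly chosen peers.
            informed.update(rng.randrange(num_nodes) for _ in range(fanout))
    return rounds


rounds_needed = gossip_rounds(100)
```

Because the informed set can at most triple per round with a fanout of 2, reaching 100 nodes takes at least 5 rounds; in practice the count stays small and grows slowly with the number of nodes, which is why gossip scales well.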
Request/Response and Architectural Styles
• Message formats
  – RMI, CORBA, REST/HTTP, Web Service, Thrift
• Architectural Styles
  – Remote Procedure Calls (RPC)
  – Distributed Objects
  – Service Oriented Architecture (SOA)
  – Resource Oriented Architecture (ROA)
Known Distributed Architecture Patterns
• LB (load balancer) + Shared-Nothing Nodes
• LB + Stateless Nodes + Scalable Storage
• DHT (Distributed Hash Table)
• Distributed queues
• Publish/Subscribe Broker Network
• Gossip architectures + biology-inspired algorithms
• MapReduce/data flows
• Stream processing
• Tree of responsibility
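The DHT pattern in this list rests on consistent hashing: nodes and keys are hashed into the same space, and each key belongs to the next node around the ring. The sketch below shows only that placement rule, assuming every client knows all nodes; real DHTs such as Chord also route lookups between nodes. `HashRing` and the node names are hypothetical.

```python
import bisect
import hashlib


def _hash(value):
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Keys and nodes share one hash space; a key belongs to the
    first node at or after its position, wrapping around the ring."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._points = [h for h, _ in self._ring]

    def lookup(self, key):
        i = bisect.bisect_left(self._points, _hash(key))
        return self._ring[i % len(self._ring)][1]  # wrap past the end


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("user:1234")
```

The useful property is that removing a node only moves the keys that node owned; keys on the remaining nodes stay where they were.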
LB + Shared Nothing and 3-Tier




• Most common scaling pattern
• Most architectures follow this model
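A load balancer in front of shared-nothing nodes can be sketched as round-robin selection: since the nodes share no state, any of them can take the next request. The class and node names below are hypothetical, and real load balancers add health checks and weighting that this sketch leaves out.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backend nodes in turn; nodes share no state."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(list(nodes))

    def pick(self):
        return next(self._cycle)


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assigned = [lb.pick() for _ in range(5)]
```

Scaling out then amounts to adding another node to the list, which is much of why this pattern is so common.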
Storages
• Single Database
• Replicated Databases
• Parallel Databases (Sharding)
• NewSQL (in-memory, sharding, highly optimized)
• NoSQL (column family, key-value pair, document)
Building Scalable Systems
• Single Machine
• Shared Memory Model
• Clustering (State Replication through group communication)
• Shared Nothing
• Loose Consistency with Shared Nothing
Photo: http://www.fotopedia.com/items/louromig-8P4w6xtSgbY, Licensed CC
Publish Subscribe and EDA




• Many publishers send events
• Subscribers register interest in events, and a publish/subscribe network matches events and routes them to the subscribers
• Scalable implementations exist
• Basis for event-driven architectures
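The publish/subscribe idea can be sketched with a single in-process broker that matches events to subscribers by topic. This is a toy under stated assumptions: topic-based matching, synchronous delivery, and one broker rather than a network of them. `Broker` and the topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Single in-process broker; a real deployment would be a network of brokers."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # Match the event to subscribers of its topic and deliver it.
        for callback in self._subscribers.get(topic, []):
            callback(event)


broker = Broker()
received = []
broker.subscribe("traffic", received.append)
broker.publish("traffic", {"road": "A1", "speed_kmh": 12})
broker.publish("weather", {"temp_c": 31})  # no subscriber, so it is dropped
```

Publishers and subscribers never refer to each other, only to topics; that decoupling is what event-driven architectures build on.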
Cloud Computing
• Ability to buy computing power, storage, or execution services as a utility, on demand.
• The best way to explain it is to compare it to electricity
• The idea is to build a big pool of servers and share it.
  – Economies of scale through optimized large-scale operations.
  – Resource pooling.
  – No need for capacity planning; start small and grow as needed.
  – Outsourcing, which enables specialization.
                                             photo by LoopZilla on Flickr,
                       http://www.flickr.com/photos/loopzilla/2328231843/sizes/m/in/photostre
Where do we go from here?
If You Plan to Learn about Distributed Systems
• It is one of the fields you learn by doing
• You have to be a good programmer
   – a patient one (debugging)
   – a lazy one (but intelligent)
• Start by writing some Web Services, request/response stuff
• Stop reinventing the wheel, start using tools (middleware)
• Learn ZooKeeper
• Take a class: read, write code, debug, ..
                                      http://www.flickr.com/photos/mariachily/5250487136,
                                                           Licensed CC
Distributed System Community
•   Based around ACM, IEEE, and USENIX
•   Well-known journals
     – IBM Systems Journal, ACM Operating Systems Review, ACM Transactions on Computer Systems, IEEE Distributed Systems Online, IEEE Transactions on Parallel and Distributed Systems
•   Conferences
     – Theory: ICDCS, SPDC
     – SOA/Cloud: ICWS
     – E-Science, Parallel Programming: HPDC, SC, E-Science, CCGrid
     – Systems: USENIX, Middleware, ACM Symposium on Operating Systems Principles, FAST, LISA, OSDI
     – DB: SIGMOD Record, VLDB
•   Awards
     – Turing Award
     – Edsger W. Dijkstra Prize in Distributed Computing
Photos: http://www.flickr.com/photos/dullhunk/4187914071, http://www.fotopedia.com/items/flickr-1544709148, Licensed CC
A Few Must-Read Papers
•   System Structure for Software Fault Tolerance (1975)
•   Reaching Agreement in the Presence of Faults (1980)
•   Time, Clocks, and the Ordering of Events in a Distributed System (1978)
•   The Byzantine Generals Problem (1982)
•   End-to-End Arguments in System Design (1984)
•   A Note on Distributed Computing (1994)
•   Scale in Distributed Systems (1994)
•   The Google File System (2003)
•   Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (2001)
•   Xen and the Art of Virtualization (2003)
•   MapReduce: Simplified Data Processing on Large Clusters (2004)
Some Open Challenges
• Everything Data: Analytics, AI, Data Mining (distributed versions of many algorithms)
• Complex Event Processing (CEP)
• How to Scale?
• Middleware for the Cloud
• Scalable Storage
• Provenance
• Workflows
• Guard against DDoS and other Distributed Security Issues
Photo: http://www.flickr.com/photos/brianscott/5474210001, Licensed CC
Questions?




Copyright by romainguy, and licensed for reuse under CC License
    http://www.flickr.com/photos/romainguy/249370084

Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 

Dernier (20)

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

Keynote for CSE conference 2011: Distributed Systems: What? Why? And bit of How?

  • 1.
  • 2. I cannot cover Distributed Systems in 30 minutes! But, I can tell why you might want to learn Distributed Systems in 30 minutes! http://www.flickr.com/photos/uwehermann/82753155/sizes/m/in/photostream/ and http://www.flickr.com/photos/peterpearson/5921765552, licensed under CC
  • 3. What is a Distributed System? "A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed." --Leslie Lamport
  • 4. What is a Distributed System? “A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing.” - [Coulouris] “A distributed system is a collection of independent computers that appear to the users of the system as a single coherent system.” - [Tanenbaum]
  • 5. Characteristics and Challenges • No Global Clock • Communication only by message passing • No Global State • Independent Failures • Fault Tolerance • Scale • Transparency Photo by John Trainor on Flickr http://www.flickr.com/photos/trainor/2902023575/, Licensed u
  • 6. Fallacies of Distributed Systems • The network is reliable. • Latency is zero. • Bandwidth is infinite. • The network is secure. • Topology doesn't change. • There is one administrator. • Transport cost is zero. • The network is homogeneous. http://www.flickr.com/photos/12587661@N06/2300406685, @Michael Gwyther-Jones, L
  • 7. Why Distributed Systems • Need to build bigger systems • Many use cases are inherently distributed • To avoid failures • Omnipresence – If you buy food from a supermarket – If you buy a book from a bookshop chain – If you search the Web – If you use a GPS navigator – If you turn on your My 10 list – If you pay a bill – If you use your mobile app
  • 8. A System Usecase Classification • Processing Data (Moving vs. Stored Data) • Servers: Receive, Process, and Respond • Running User provided Jobs • Data Storages and Provenance http://www.flickr.com/photos/kelsea-groves/5535666329/
  • 9. Usecase: Processing Data: React to Sensors • Many sensors: Weather, Travel, Traffic, Surveillance, Stock exchange, Smart Grid, Production line • Monitor, understand, and react to events • Usually handled with CEP (e.g. Esper, StreamBase, Siddhi) or Stream Processing (S4, Twitter Stream) http://www.flickr.com/photos/imuttoo/4257813689/ by Ian Muttoo, http://www.flickr.com/photos/eastcapital/4554220770/, http://www.flickr.com/photos/patdavid/4619331472/ by Pat David copyright CC
  • 10. Usecase: Processing Data: Target Marketing • Receive data about users continuously: e.g. web clicks, what they bought, what they like and do not like, what their friends like and bought • Build models and index information in the background • Send each user the advertisements that best match their preferences – have to do this quickly – in a few (say 50) milliseconds • Could be the next billion dollar problem
  • 11. Usecase: Receive, Process, and Respond: Online Store (e.g. Amazon) • Many sellers selling many items, and many buyers • List of all items, with their specs • Index items by many dimensions and support search • Support checkout, delivery tracking, returns, ratings, and complaints • Supported by partitioning sellers/items across many nodes
  • 12. Usecase: Running User Provided Jobs: SETI@Home • Many people volunteer their computing power • Scientists submit computing jobs to the system • Broker and match resources with jobs, run them, and return results. Handle failures. Avoid free riding. • Considered the biggest computer on earth (505 TFLOPS, 150k active computers) http://www.elfwood.com/~axthony/Staring-Aliens.2552052.html, Licensed CC
  • 13. Usecase: Data Storages and Provenance (Sky Server) • Telescopes (Square Kilometer Array) keep collecting data from the sky (terabytes per day) • Sky Server lets scientists see the sky at a given location, as seen at a given time. • Moving data takes a long time. 1 TB takes – 100 Mbps network: 30 hrs – 1 Gbps network: 3 hrs – 10 Gbps network: 20 minutes • Given a data item, need to track how it was created, equipment accuracy, transformations used, etc. http://www.fotopedia.com/items/flickr-518876976 and http://www.geograph.org.uk/photo/103069, Licensed CC
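The transfer times quoted on slide 13 can be sanity-checked with a short calculation. The sketch below is illustrative and not from the talk: it assumes a decimal terabyte (10^12 bytes) and introduces a hypothetical `efficiency` parameter for protocol overhead. At ideal line rate the results are lower bounds, so the slide's rounded figures are consistent with a link that delivers somewhat less than its nominal bandwidth.

```python
def transfer_time_hours(size_bytes: float,
                        link_bits_per_second: float,
                        efficiency: float = 1.0) -> float:
    """Hours needed to move size_bytes over a link.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the nominal
    bandwidth that is actually usable; 1.0 means ideal line rate.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = (size_bytes * 8) / (link_bits_per_second * efficiency)
    return seconds / 3600


ONE_TB = 10**12  # assumption: decimal terabyte

if __name__ == "__main__":
    for label, rate in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
        hours = transfer_time_hours(ONE_TB, rate)
        print(f"{label}: {hours:.2f} h ({hours * 60:.0f} min) at ideal line rate")
```

At ideal line rate this gives roughly 22 hours, 2.2 hours, and 13 minutes. Passing an `efficiency` of about 0.75 brings the 100 Mbps case close to the 30 hours shown on the slide.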
  • 14. Mobile Sensor Crowdsourcing • Mobile phones are now like a weather center: they have – a barometer – temperature sensor – proximity sensor – GPS – moisture sensor • Get volunteer phones to send sensor data (crowdsourcing) – report on weather – crop diseases (agriculture officials) – epidemics (from hospitals, doctors) • Use that to predict weather, crop disease, and epidemic spread • Moving Sensors (Polar Grid) http://www.fotopedia.com/items/flickr-2548697541, http://www.geograph.org.uk/photo/1534209, and http://www.yourbdnews.com/2011/10/17/samsung-files-to-halt-iphone-4s-in-japan-australia/iphone-4s, Licensed CC
  • 15. Great! Let's see which Distributed Systems technologies have made these use cases possible!
  • 16. Distributed Systems Timeline/History (period: topics) – 1965-late 70s: Parallel Programming, Self Stabilization, Fault Tolerance, ER Model/Transactions, Time Clock – 1980s: Consensus and Impossibility, SQL, Distributed Snapshots, Replication, Group Communication – Early 90s: Linearizability, Parallel DB, Transactional Memory, RAID, MPI – Late 90s: Volunteer Computing, P2P File Sharing, Complex Event Processing – Early 2000s: OceanStore, Web Services, Semantic Web, REST, DHT, Pub/Sub, Grid, Autonomic Computing, Google File System, Virtualization, SOA, MapReduce – 2005-2010: Cloud, NoSQL, Mobile Apps, Data Provenance
  • 17. Theoretical Computer Science • Concerned with – Coordination algorithms: leader election, multicast, distributed locks, barriers, snapshot algorithms – Impossibility results, upper and lower bounds – Distributed versions of some centralized algorithms (e.g. shortest path) – A lot of this work was done in the 70s and laid the groundwork for Distributed Systems http://www.flickr.com/photos/lodz_na_nowo/5690492370/ http://xkcd.com/384/ http://www.flickr.com/photos/quinnanya/4990131194/sizes/z/in/photostream/ , Licensed CC
  • 18. Communication Protocols • Request/Response – RMI, CORBA, REST/HTTP, WS, Thrift • Publish/Subscribe • Distributed Queues • DHT (Distributed Hash Tables) • Gossip/ Epidemic Protocols • Whiteboards http://www.flickr.com/photos/novecentino/2596898279/, Licensed CC
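Slide 18 names gossip/epidemic protocols without showing how they spread information. The following is a minimal simulation sketch of push-style gossip, assuming synchronous rounds and uniformly random peer selection. The function name `gossip_rounds` and its parameters are illustrative and do not come from the talk or any library.

```python
import random


def gossip_rounds(n_nodes: int, fanout: int = 1, seed: int = 0) -> int:
    """Count rounds until a rumor started at node 0 reaches every node.

    In each round, every informed node pushes the rumor to `fanout` peers
    chosen uniformly at random (a peer may already be informed, or may even
    be the sender itself; such pushes are simply wasted).
    """
    if n_nodes < 1 or fanout < 1:
        raise ValueError("n_nodes and fanout must be at least 1")
    rng = random.Random(seed)  # seeded so a run is repeatable
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        newly_informed = set()
        for _ in informed:
            for _ in range(fanout):
                newly_informed.add(rng.randrange(n_nodes))
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, "nodes:", gossip_rounds(n, seed=42), "rounds")
```

Running it for growing `n_nodes` shows the round count growing far more slowly than the node count, which is why gossip is attractive for large clusters. Real protocols add failure handling, message loss, and anti-entropy, none of which are modeled here.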
  • 19. Request/Response and Architectural Styles • Message formats • RMI, CORBA, REST/HTTP, Web Service, Thrift • Architectural Styles – Remote Procedure Calls (RPC) – Distributed Objects – Service Oriented Architecture (SOA) – Resource Oriented Architecture (ROA)
  • 20. Known Distributed Architecture Patterns • LB + Shared Nothing Nodes • LB + Stateless Nodes + Scalable Storage • DHT (Distributed Hash Table) • Distributed queues • Publish/Subscribe Broker Network • Gossip architectures + biology inspired algorithms • MapReduce/data flows • Stream processing • Tree of responsibility
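The DHT pattern on slide 20 depends on a stable mapping from keys to nodes. One common way to get that mapping is consistent hashing, sketched below as an illustration. This is an assumption about how a DHT might place keys, not the design of any specific system from the talk, and the class name `HashRing` and the `replicas` parameter are hypothetical. It covers key placement only; real DHTs such as Chord also define how nodes route lookups to each other.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (SHA-1 used only for spreading)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring: a key belongs to the first node point at or after it."""

    def __init__(self, nodes=(), replicas: int = 64):
        self.replicas = replicas   # virtual points per node, to even out load
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            if point in self._owner:   # extremely unlikely collision: skip it
                continue
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # Wrap around to the first point when the key hashes past the last one.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.lookup("user:42"))
```

The property that matters for scaling is that removing a node only moves the keys that node owned; every other key keeps its owner.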
  • 21. LB + Shared Nothing and 3-Tier • Most common scaling pattern • Most architectures follow this model
  • 22. Storages • Single Database • Replicated Databases • Parallel Databases (Sharding) • NewSQL (in-memory, sharding, highly optimized) • NoSQL (Column Family, Key-Value Pair, Document)
  • 23. Building Scalable Systems • Single Machine • Shared Memory Model • Clustering (State Replication through group communication) • Shared Nothing • Loose Consistency with Shared Nothing http://www.fotopedia.com/items/louromig-8P4w6xtSgbY, Licensed CC
  • 24. Publish Subscribe and EDA • Many publishers send events • Subscribers register for events, and a publish/subscribe network matches events and routes them to the subscribers • Scalable implementations exist • Basis for event driven architectures
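The matching step described on slide 24 can be sketched with a tiny in-process, topic-based broker. This is an illustration only: the class name `Broker` and its methods are hypothetical, matching is by exact topic name, and delivery is synchronous. A real publish/subscribe network distributes this matching across many broker nodes.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[str, Any], None]


class Broker:
    """Minimal topic-based publish/subscribe broker (single process)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register interest in a topic; handler is called for each event."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> int:
        """Deliver event to all subscribers of topic; return how many got it."""
        # Copy the list so handlers may subscribe during delivery.
        handlers = list(self._subscribers.get(topic, ()))
        for handler in handlers:
            handler(topic, event)
        return len(handlers)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("weather", lambda topic, event: print(topic, "->", event))
    broker.publish("weather", {"temp_c": 21.5})   # delivered to one subscriber
    broker.publish("traffic", {"jam": True})      # no subscribers, dropped
```

Publishers and subscribers never reference each other, only the topic name. That decoupling is what makes the pattern a basis for event driven architectures.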
  • 25. Cloud Computing • Ability to buy computing power, storage, or execution services as a utility, on demand. • Best explained by comparing it to electricity. • The idea is to share a big pool of servers. • Economies of scale through optimized large-scale operations. • Resource pooling. • No need for capacity planning: start small and grow as needed. • Outsourcing enables specialization. photo by LoopZilla on Flickr, http://www.flickr.com/photos/loopzilla/2328231843/sizes/m/in/photostre
  • 26. Where do we go from here?
  • 27. If You Plan to Learn about Distributed Systems • One of the fields to learn by doing • You have to be a good programmer – a patient one (Debugging) – Lazy one (but intelligent) • Start by writing some Web Services, request response stuff • Stop reinventing the wheel, start using tools (middleware) • Learn Zookeeper • Take a class – read, write code, debug, .. http://www.flickr.com/photos/mariachily/5250487136, Licensed CC
  • 28. Distributed System Community • Based around ACM, IEEE, and USENIX • Well known journals – IBM Systems Journal, ACM Operating Systems Review, ACM Transactions on Computer Systems, IEEE Distributed Systems Online, IEEE Transactions on Parallel and Distributed Systems • Conferences – Theory: ICDCS, SPDC – SOA/Cloud: ICWS – E-Science, Parallel Programming: HPDC, SC, E-Science, CCGrid – Systems: USENIX, Middleware, ACM Symposium on Operating Systems Principles, FAST, LISA, OSDI – DB: SIGMOD Record, VLDB • Awards – Turing Award – Edsger W. Dijkstra Prize in Distributed Computing http://www.flickr.com/photos/dullhunk/4187914071, http://www.fotopedia.com/items/flickr-1544709148, Licensed CC
  • 29. A Few Must Read Papers • System Structure for Software Fault Tolerance (1975) • Time, Clocks, and the Ordering of Events in a Distributed System (1978) • Reaching Agreement in the Presence of Faults (1980) and The Byzantine Generals Problem (1982) • End-to-End Arguments in System Design (1984) • A Note on Distributed Computing (1994) • Scale in Distributed Systems (1994) • Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (2001) • The Google File System (2003) • Xen and the Art of Virtualization (2003) • MapReduce: Simplified Data Processing on Large Clusters (2004)
  • 30. Some Open Challenges • Everything Data: Analytics, AI, Data Mining (distributed versions of many algorithms) • Complex Event Processing (CEP) • How to Scale? • Middleware for the Cloud • Scalable Storage • Provenance • Workflows • Guard against DDoS and other Distributed Security Issues http://www.flickr.com/photos/brianscott/5474210001, Licensed CC
  • 31. Questions? Copyright by romainguy, and licensed for reuse under CC License http://www.flickr.com/photos/romainguy/249370084