David Rich
April 2011




     The Onset of Parallelism


             Changes in computer architecture and
                Microsoft’s role in the transition
Your introduction – some
questions…
!  What kind of software do you see
   yourself working on in the future?
  Scientific? Web? Games? Business?
!  Have you worked on a distributed app?
   MPI?
!  Have you used Visual Studio?
!  Which will limit performance in the
   future:
  Power consumption? Latency? Lack of
  parallelism? Bugs?
!   Made in 1922 by Robert Flaherty
!   Considered to be the first full-length documentary, though some scenes were staged
!   http://en.wikipedia.org/wiki/Nanook_of_the_north
Job Specialization
Bricklayer / Masonry
Carpenter
Caulker / Pointer / Cleaners
Cement Mason
Construction Lineman
Drywall Finisher/Taper
Electrician, Elevator Mechanic
Electrician, HVAC-Environmental Control System Servicer & Installer
Electrician, General Journeyman (Inside)
Electrician, Limited Energy Technician A
Electrician, Limited Energy Technician B
Electrician, Limited Renewable Energy Technician
Electrician, Limited Residential
Electrician, Sign Maker-Erector / Sign Hanger / Sign Assembler-Fabricator
Exterior/Interior Specialist (metal framing & drywall)
Finisher, Masonry
Floorcoverer
Glazier (construction)
Heat / Frost Insulator
Heavy Duty Repairer
Industrial Pipefitter (construction)
Industrial Welder (construction)
Ironworker, Structural
Laborer
Marble Setter, Masonry
Millwright Construction Machinery Erector
Operating Engineer
Painter-Decorator / Traffic Control Painter
Pile Driver
Pipefitter
Plasterer
Plumber
Renewable Energy Technician
Roofer
Scaffold Erector
Sheet Metal Worker
Solar Heating/Cooling Systems Installer
Sprinkler Fitter
Steamfitter
Technical Engineer
Terrazzo Worker, Masonry
Tilesetter, Masonry
Tree Trimmer, Power Line
Truck Driver (Heavy)

•  What about?
   –  Architect
   –  Surveyor
   –  Inspector
•  Or people that work in the companies that produce pre-fab components?
   –  Pipes, wires, windows, fixtures, etc.
Guggenheim Museum in Bilbao

Acorn pre-fab house
Preparing for the Future – What Will Your Machine Look Like in 5 to 10 Years?
!   Look at the Top500, predict and divide:
   1. At any point in time, most organizations can afford a machine which is 1/1000th the size of the #1 machine on the Top500
   2. Exaflop comes from 2x efficiency, 2x frequency and 100x the cores

          Today’s #1        Test: Is this within       Exaflop        Your Future
          Tianhe-1A         your budget? (1/1000th)                   Platform
 Perf:    2.5 PFs           2.5 TFs                    1000 PFs       1 PF
 Nodes:   7,168             7                          500,000?       500?
 Cores:   X86: 86,016       X86: 86 (~14 Xeons)        130 Million    130 Thousand
          GPU: 3,211,164    GPU: 3,211 (~7 Tesla)                     Cores…
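
The predict-and-divide arithmetic on this slide can be checked in a few lines. The following is a minimal sketch, not anything from the talk itself: the constant and function names are illustrative, and the inputs are only the figures the slide quotes (2.5 PF for Tianhe-1A, the 1/1000th budget rule, and the 2x efficiency, 2x frequency, 100x cores multipliers).

```python
# Predict-and-divide, using only figures quoted on the slide.
# All names below are illustrative assumptions, not part of any library.

TOP500_NUMBER_ONE_PFLOPS = 2.5   # Tianhe-1A performance, as quoted
AFFORDABLE_FRACTION = 1 / 1000   # rule 1: most organizations can afford 1/1000th

# Rule 2: the exaflop machine comes from these three multipliers.
EFFICIENCY_GAIN = 2
FREQUENCY_GAIN = 2
CORE_COUNT_GAIN = 100


def affordable_pflops(top_pflops: float) -> float:
    """Performance of a machine 1/1000th the size of the #1 system."""
    return top_pflops * AFFORDABLE_FRACTION


def projected_top_pflops(top_pflops: float) -> float:
    """Projected #1 machine after applying the three multipliers."""
    return top_pflops * EFFICIENCY_GAIN * FREQUENCY_GAIN * CORE_COUNT_GAIN


today_budget = affordable_pflops(TOP500_NUMBER_ONE_PFLOPS)
exaflop = projected_top_pflops(TOP500_NUMBER_ONE_PFLOPS)
future_budget = affordable_pflops(exaflop)

print(f"Within budget today: {today_budget * 1000:.1f} TF")
print(f"Projected #1 machine: {exaflop:.0f} PF")
print(f"Your future platform: {future_budget:.0f} PF")
```

Dividing 2.5 PF by 1000 gives 2.5 TF for the machine that is within budget today, and the same division applied to the projected 1000 PF machine gives the 1 PF future platform.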
Core Counts On the Rise
[Chart: Number of Cores in Top500 #1 Over Time, Jun 93 to Nov 10. Y axis: Cores (0 to 3,500,000, with an inset scale of 0 to 250,000). Systems labeled, in order: Fujitsu, ASCI Red, ASCI White, Earth Simulator, Blue Gene, RoadRunner, Jaguar, Tianhe-1A. Annotation at Tianhe-1A: GPUs Get to #1...]
Good News:
  Everybody gets a Petaflop!


Bad News:
  You have to find 200,000-way parallelism
Caveat: No biology since high school…
Niche vs. Commodity Computing in HPC
“Perfect Predator”: homogeneity; performance growth with decreasing cost and no code changes.
80’s: Vertically Integrated Single Machines. IBM, Digital, Cray, HP (Apollo, Data General, Prime, Masscomp, Gould…)
90’s: Cluster of SMP. RISC + *nix + MPI. IBM, Digital, SGI…
00’s: Commodity Clusters. Horizontal Industry. 64bit x86 + Linux. IBM, Dell, HP, many others
10’s: Commodity Clusters Plus: GPU, Multicore, Cloud, FPGA, “big data” & Windows!
20’s: ?
www.calxeda.com
2 years: 12 MM users
6 years: 2 Bil emails/day
7 years: 5 Bil conf mins/yr.
11 years: 12 Bil queries/mo.
12 years (Update): 40 Petabytes/mo.
13 years: 550 MM users/mo.
15 years: 450 MM users

500 Million active Windows Live IDs
9.9 Billion messages / day via WL Messenger
Over 1 Million BPOS Users in 36 Countries
Microsoft’s Datacenter Evolution

Generation 1 - Datacenter Co-Location: Server, Capacity
Generation 2 - Quincy and San Antonio
Generation 3 - Chicago and Dublin
Generation 4 - Modular Datacenter: Facility PAC, Time to Market, Lower TCO
Generation 3 - Chicago Data Center
   $500M+ investment
   1.5 million person-hours of labor
   3,000 construction-related jobs
   3,400 tons of steel
   707,000 sq ft
   190 miles of conduit
   2,400 tons of copper
   7.5 miles of chilled water piping
   26,000 cubic yards of concrete
Visual Studio

!   Visual Studio is used by over half of the professional
    programmers in the world
!   VS2010 – released a year ago – has been downloaded
    over 7 million times (more than 4 million extension
    downloads)
!   Main point: when we release a new capability into Visual
    Studio it automatically gets large adoption
!   (story about the ISC developers)
Microsoft and GPUs

                     The volume
                     business….
GPU Hardware Evolution

Year   Version     Defining Feature
1996   DirectX3    Hardware rasterization
1997   DirectX5    2 Shading options to select
1998   DirectX6    Multi-texture operations
1999   DirectX7    Vertex Processing in hardware
2000   DirectX8    Programmable Shaders: Vertex and Pixel
2002   DirectX9    High Level Shading Language, 32 instr
2003   DirectX9c   1000s of instructions per shader
2006   DirectX10   Unified Shaders: consistent shader models
2009   DirectX11   Compute Shader: explicit SIMD, random I/O
The GPGPU Software Stack
!  Windows has broad support at all levels:
   • Supports all HW
   • Each of CUDA, OpenCL and DirectCompute
   • Almost all high level tools and libraries

Stack, top to bottom:
   High level tools and libraries: PGI “x86 CUDA”, CAPS, Culatools, Volara, Acceleware
   Low Level Programming: CUDA, OpenCL, DirectCompute
   Hardware: GPU: AMD & NVIDIA; Multicore x86: AMD & Intel
DirectCompute
!  What is DirectCompute?
   • Microsoft’s GPGPU Programming Solution
   • API of the DirectX Family
   • Component of the Direct3D API
!  Why Use DirectCompute Over Other APIs?
   • Interoperability with rest of 2D, 3D, Video rendering APIs (display computed results)
   • Cross-hardware compatibility
   • Feature compatibility guarantees
   • Access to fixed-function hardware
!  Used extensively by the gaming community
          http://msdn.microsoft.com/directx
GPGPU Development on Windows

!  Choice: CUDA, OpenCL or DirectCompute
!  Tools and libraries:
  Nsight and Visual Studio, PGI, CAPS, MATLAB,
  Jacket, PyCUDA, Quantifi, CUDA.NET, Culatools,
  NAG, Scicomp… and many others
!  NVIDIA reports that over 80% of CUDA SDK
   downloads are for Windows
Microsoft and NVIDIA

NVIDIA’s Parallel Nsight is integrated
with Microsoft’s Visual Studio
MATLAB
   Desktop Computer: Parallel Computing Toolbox
   Computer Cluster: MATLAB Distributed Computing Server, Windows HPC Server, Workers
[Diagram; surviving labels, in source order: Cluster, HPC, ISV / OSS, Excel, MPI, SOA, Applications, HPC Middleware Pack, HPC Edition, Operating Systems, On Premise Cluster, Computing]

*Note that in SP1 support for MPI applications on Azure does not exist.
Performance Parity Between Linux and Windows
1 Million active Cells, 1000 wells, Blackoil
Elapsed Time [secs]

Cores                             1         2             4           6           8          16      24       32       48
RedHat 5 U3                     5200.43   3385.17      3095.72     2281.25     1790.59    1014.42   776.71   638.43   621.42
Win HPC R2 SP1                  5404.38   3298.55       3175.9     2171.37     1736.11     992.82   745.43   610.88   549.74




                      Make your choice based on features and TCO…
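
One way to read the table is to turn the elapsed times into speedup and parallel efficiency. The sketch below does that with the numbers copied from the table; the helper names are illustrative, and speedup is taken relative to each system's own single-core run, which is an assumption about how to compare them rather than something the slide states.

```python
# Elapsed times in seconds, copied from the table above.
CORES = [1, 2, 4, 6, 8, 16, 24, 32, 48]
ELAPSED = {
    "RedHat 5 U3": [5200.43, 3385.17, 3095.72, 2281.25, 1790.59,
                    1014.42, 776.71, 638.43, 621.42],
    "Win HPC R2 SP1": [5404.38, 3298.55, 3175.9, 2171.37, 1736.11,
                       992.82, 745.43, 610.88, 549.74],
}


def speedup(times):
    """Speedup relative to the single-core run: T(1) / T(n)."""
    return [times[0] / t for t in times]


def efficiency(times, cores):
    """Parallel efficiency: speedup divided by the core count."""
    return [s / n for s, n in zip(speedup(times), cores)]


for name, times in ELAPSED.items():
    print(name)
    for n, s, e in zip(CORES, speedup(times), efficiency(times, CORES)):
        print(f"  {n:>2} cores: speedup {s:5.2f}, efficiency {e:6.1%}")

# Count the core counts at which the Windows run finished first.
windows_wins = sum(
    win < lin
    for win, lin in zip(ELAPSED["Win HPC R2 SP1"], ELAPSED["RedHat 5 U3"])
)
print(f"Windows faster at {windows_wins} of {len(CORES)} core counts")
```

On these figures the two systems trade places at low core counts and stay within a few percent of each other up to 32 cores, while speedup at 48 cores is under 10x on both, so the scaling limit comes from the workload rather than the operating system.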
Excel SOA Client (NEW)
!   Connects to the cluster as a SOA client
!   VSTO code in workbook calls out to SOA Service
!   Input and output managed by Excel developer

Excel Workbook on the Cluster (NEW)
!   Run multiple instances of Excel 2010 on an HPC Cluster
!   Each instance runs an iteration of the same workbook
!   Can be launched from Excel 2010 or a Windows program
!   Excel Dialog Suppression

Excel UDF on the Cluster (NEW)
!   Run User Defined Functions in parallel on a cluster
!   Excel 2010 includes a new API and options for HPC cluster
!   Support for .XLL files developed through Excel SDK
!   Easy to develop on a desktop and then deploy to a cluster
!   Use Azure servers to run HPC compute Jobs
  !   Can be used to “burst-out” to the cloud to handle peak demand
  !   Can create clusters that include dedicated on-premise servers, non-dedicated
      workstations and shared Azure servers
      !    Jobs can run unchanged across all 3 types of compute nodes (no support for MPI in SP1)
      !    Azure nodes are added to cluster using the Administration console (just like Workstation nodes)


[Diagram labels: HPC Clients; Jobs, Requests; Head & Broker Nodes; Azure Gateway; Azure]
Compute Nodes On-Premise and in Azure Simultaneously
• “Burst” into cloud on-demand while keeping control over data and corporate policies
• Pay only for what you use
• A stepping stone to hybrid and public clouds
• Dynamically adjust how much runs on-premise and in the cloud

[Diagram labels: HPC Head Node; Broker Node; On-premise Compute Nodes; Desktops; Azure Compute Proxies; Azure Compute Instances]
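
The "burst" idea can be illustrated with a toy placement rule: fill on-premise capacity first and send only the overflow to the cloud. This is a hypothetical sketch for illustration only. `Placement` and `place_jobs` are made-up names, the slot counts are invented, and nothing here reflects how the Windows HPC Server scheduler or its Azure integration actually decides placement.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """How many queued jobs go to each pool (hypothetical type)."""
    on_premise: int
    azure: int


def place_jobs(queued_jobs: int, on_premise_slots: int,
               max_azure_slots: int) -> Placement:
    """Fill on-premise capacity first, then burst the overflow to Azure.

    max_azure_slots caps the cloud spend; jobs beyond both limits stay queued.
    """
    on_premise = min(queued_jobs, on_premise_slots)
    overflow = queued_jobs - on_premise
    azure = min(overflow, max_azure_slots)
    return Placement(on_premise=on_premise, azure=azure)


# Quiet period: everything fits on-premise, so no cloud nodes are used.
print(place_jobs(queued_jobs=40, on_premise_slots=64, max_azure_slots=128))

# Peak demand: the 64 local slots fill up and the other 86 jobs burst out.
print(place_jobs(queued_jobs=150, on_premise_slots=64, max_azure_slots=128))
```

Changing `on_premise_slots` or `max_azure_slots` between calls is the toy equivalent of dynamically adjusting how much runs on-premise and how much runs in the cloud.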
Parallel Development




 “Combined with Intel Parallel Studio, I think it is
 reasonable to say that Windows has the richest and most
 complete set of tools for multicore programming.”
 - James Reinders, Intel, 12-April-2010
Solution Begins with DEVELOPERS
Make it easier to express and manage the correctness, efficiency and maintainability of parallelism on Microsoft platforms for developers of all skill levels
•  Enable developers to express parallelism easily and focus on the problem to be solved
•  Improve the efficiency and scalability of parallel applications
•  Simplify the process of designing and testing parallel applications
Visual Studio 2010
Tools, Programming Models, Runtimes

Tools (Visual Studio IDE): Parallel Debugger Tool Windows; Profiler Concurrency Analysis
Programming Models:
   Managed (.NET Framework 4): Parallel LINQ, Task Parallel Library, Data Structures
   Native (Visual C++ 10): Parallel Pattern Library, Agents Library, Data Structures
Runtimes:
   Managed: ThreadPool (Task Scheduler, Resource Manager)
   Native: Concurrency Runtime (Task Scheduler, Resource Manager)
Operating System: Windows; Threads; UMS Threads
Legend: Managed, Native, Tooling
World’s Fastest House Construction
      Three and a Half Hours
http://www.microsoft.com/hpc
              David Rich
        darich at microsoft.com
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

[Harvard CS264] 15a - The Onset of Parallelism, Changes in Computer Architecture and Microsoft's Role in the Transition (David Rich, Microsoft Research)

IAP09 CUDA@MIT 6.963 - Lecture 01: High-Throughput Scientific Computing (Hans...npinto
 

More from npinto (16)

"AI" for Blockchain Security (Case Study: Cosmos)
"AI" for Blockchain Security (Case Study: Cosmos)"AI" for Blockchain Security (Case Study: Cosmos)
"AI" for Blockchain Security (Case Study: Cosmos)
 
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
[Harvard CS264] 06 - CUDA Ninja Tricks: GPU Scripting, Meta-programming & Aut...
 
[Harvard CS264] 05 - Advanced-level CUDA Programming
[Harvard CS264] 05 - Advanced-level CUDA Programming[Harvard CS264] 05 - Advanced-level CUDA Programming
[Harvard CS264] 05 - Advanced-level CUDA Programming
 
[Harvard CS264] 04 - Intermediate-level CUDA Programming
[Harvard CS264] 04 - Intermediate-level CUDA Programming[Harvard CS264] 04 - Intermediate-level CUDA Programming
[Harvard CS264] 04 - Intermediate-level CUDA Programming
 
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
 
[Harvard CS264] 02 - Parallel Thinking, Architecture, Theory & Patterns
[Harvard CS264] 02 - Parallel Thinking, Architecture, Theory & Patterns[Harvard CS264] 02 - Parallel Thinking, Architecture, Theory & Patterns
[Harvard CS264] 02 - Parallel Thinking, Architecture, Theory & Patterns
 
[Harvard CS264] 01 - Introduction
[Harvard CS264] 01 - Introduction[Harvard CS264] 01 - Introduction
[Harvard CS264] 01 - Introduction
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
IAP09 CUDA@MIT 6.963 - Guest Lecture: CUDA Tricks and High-Performance Comput...
 
IAP09 CUDA@MIT 6.963 - Lecture 07: CUDA Advanced #2 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 07: CUDA Advanced #2 (Nicolas Pinto, MIT)IAP09 CUDA@MIT 6.963 - Lecture 07: CUDA Advanced #2 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 07: CUDA Advanced #2 (Nicolas Pinto, MIT)
 
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
 
IAP09 CUDA@MIT 6.963 - Lecture 04: CUDA Advanced #1 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 04: CUDA Advanced #1 (Nicolas Pinto, MIT)IAP09 CUDA@MIT 6.963 - Lecture 04: CUDA Advanced #1 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 04: CUDA Advanced #1 (Nicolas Pinto, MIT)
 
IAP09 CUDA@MIT 6.963 - Lecture 03: CUDA Basics #2 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 03: CUDA Basics #2 (Nicolas Pinto, MIT)IAP09 CUDA@MIT 6.963 - Lecture 03: CUDA Basics #2 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 03: CUDA Basics #2 (Nicolas Pinto, MIT)
 
IAP09 CUDA@MIT 6.963 - Lecture 02: CUDA Basics #1 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 02: CUDA Basics #1 (Nicolas Pinto, MIT)IAP09 CUDA@MIT 6.963 - Lecture 02: CUDA Basics #1 (Nicolas Pinto, MIT)
IAP09 CUDA@MIT 6.963 - Lecture 02: CUDA Basics #1 (Nicolas Pinto, MIT)
 
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
 
IAP09 CUDA@MIT 6.963 - Lecture 01: High-Throughput Scientific Computing (Hans...
IAP09 CUDA@MIT 6.963 - Lecture 01: High-Throughput Scientific Computing (Hans...IAP09 CUDA@MIT 6.963 - Lecture 01: High-Throughput Scientific Computing (Hans...
IAP09 CUDA@MIT 6.963 - Lecture 01: High-Throughput Scientific Computing (Hans...
 

Recently uploaded

ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxSayali Powar
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptxmary850239
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Developmentchesterberbo7
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxMichelleTuguinay1
 

Recently uploaded (20)

ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Development
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 

[Harvard CS264] 15a - The Onset of Parallelism, Changes in Computer Architecture and Microsoft's Role in the Transition (David Rich, Microsoft Research)

  • 1. The Onset of Parallelism: Changes in computer architecture and Microsoft’s role in the transition. David Rich, April 2011
  • 2.
  • 3. Your introduction – some questions…
       - What kind of software do you see yourself working on in the future? Scientific? Web? Games? Business?
       - Have you worked on a distributed app? MPI?
       - Have you used Visual Studio?
       - Which will limit performance in the future: Power consumption? Latency? Lack of parallelism? Bugs?
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9. - Made in 1922 by Robert Flaherty
       - Considered to be the first full-length documentary, though some scenes were staged
       - http://en.wikipedia.org/wiki/Nanook_of_the_north
  • 10.
  • 11. Job Specialization
       - Bricklayer / Masonry
       - Carpenter
       - Caulker / Pointer / Cleaners
       - Cement Mason
       - Construction Lineman
       - Drywall Finisher/Taper
       - Electrician, Elevator Mechanic
       - Electrician, HVAC--Environmental Control System Servicer & Installer
       - Electrician, General Journeyman (Inside)
       - Electrician, Limited Energy Technician A
       - Electrician, Limited Energy Technician B
       - Electrician, Limited Renewable Energy Technician
       - Electrician, Limited Residential
       - Electrician, Sign Maker-Erector / Sign Hanger / Sign Assembler-Fabricator
       - Exterior/Interior Specialist (metal framing & drywall)
       - Finisher, Masonry
       - Floorcoverer
       - Glazier
       - Heat / Frost Insulator
       - Heavy Duty Repairer
       - Industrial Pipefitter (construction)
       - Industrial Welder (construction)
       - Ironworker, Structural
       - Laborer
       - Marble Setter, Masonry
       - Millwright Construction Machinery Erector
       - Operating Engineer
       - Painter--Decorator / Traffic Control Painter
       - Pile Driver
       - Pipefitter
       - Plasterer
       - Plumber
       - Renewable Energy Technician
       - Roofer
       - Scaffold Erector
       - Sheet Metal Worker
       - Solar Heating/Cooling Systems Installer
       - Sprinkler Fitter
       - Steamfitter
       - Technical Engineer (construction)
       - Terrazzo Worker, Masonry
       - Tilesetter, Masonry
       - Tree Trimmer, Power Line
       - Truck Driver (Heavy)
       What about?
       - Architect
       - Surveyor
       - Inspector
       Or people that work in the companies that produce pre-fab components?
       - Pipes, wires, windows, fixtures, etc.
  • 13. Preparing for the Future – What Will Your Machine Look Like in 5 to 10 Years?
       Look at the Top500, predict and divide:
       1. At any point in time, most organizations can afford a machine which is 1/1000th the size of the #1 machine on the Top500
       2. Exaflop comes from 2x efficiency, 2x frequency and 100x the cores

                 Today’s #1         Test: Is this within     Exaflop        Your Future Platform
                 (Tianhe-1A)        your budget?                            (1/1000th)
       Perf      2.5 PFs            250 TFs                  1000 PFs       1 PF
       Nodes     7,168              7                        500,000?       500?
       Cores     x86: 86,016        x86: 86 (~14 Xeons)      130 Million    130 Thousand Cores…
                 GPU: 3,211,164     GPU: 3,211 (~7 Tesla)
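The "predict and divide" rule on this slide is plain arithmetic, so a short Python sketch can make it concrete. The performance and node figures are the ones quoted on the slide; the function and variable names are illustrative only. One caveat: dividing 2.5 PFs by 1000 gives 2.5 TFs, so the 250 TFs entry on the slide appears to be off by a factor of 100.

```python
# "Predict and divide" with the figures quoted on the slide.
# All names are illustrative; this is not part of any Top500 tooling.

PFLOPS = 1e15  # floating-point operations per second in one petaflop
TFLOPS = 1e12  # floating-point operations per second in one teraflop

tianhe_perf = 2.5 * PFLOPS  # today's #1 (Tianhe-1A), as quoted on the slide
tianhe_nodes = 7168

AFFORDABLE_DIVISOR = 1000   # rule 1: most organizations can afford 1/1000th of #1


def affordable(top_value):
    """Scale a figure for the #1 machine down to the 'affordable' machine."""
    return top_value / AFFORDABLE_DIVISOR


# Rule 2: exaflop comes from 2x efficiency, 2x frequency and 100x the cores.
exaflop_factor = 2 * 2 * 100
exaflop_perf = tianhe_perf * exaflop_factor

budget_perf = affordable(tianhe_perf)    # what 1/1000th buys today
future_perf = affordable(exaflop_perf)   # what 1/1000th buys in the exaflop era

print(f"1/1000th of today's #1: {budget_perf / TFLOPS:.1f} TFLOPS, "
      f"about {round(affordable(tianhe_nodes))} nodes")
print(f"Exaflop machine: {exaflop_perf / PFLOPS:.0f} PFLOPS "
      f"({exaflop_factor}x today's #1)")
print(f"Your future platform: {future_perf / PFLOPS:.0f} PFLOPS")
```

The factor of 400 (2 x 2 x 100) applied to 2.5 PFs lands exactly on 1000 PFs, which is why the slide's two rules together put a petaflop within an ordinary budget.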
  • 14. Core Counts On the Rise
       [Chart: number of cores in the Top500 #1 system over time, Jun 93 to Nov 10. Labeled systems: Fujitsu, ASCI Red, ASCI White, Earth Simulator, Blue Gene, RoadRunner, Jaguar, Tianhe-1A. Annotation at Tianhe-1A: "GPUs Get to #1..."]
  • 15. Good News: Everybody gets a Petaflop! Bad News: You have to find 200,000-way parallelism
  • 16.
  • 17.
  • 18. Caveat: No biology since high school…
  • 19. Niche vs. Commodity Computing in HPC
       [Diagram: HPC platform eras on a timeline (80’s, 90’s, 00’s, 10’s, 20’s) plotted against homogeneity]
       - Vertically integrated single machines: IBM, Digital, Cray, HP (Apollo, Data General, Prime, Masscomp, Gould…)
       - Horizontal industry, RISC + *nix: IBM, Digital, SGI…
       - Commodity clusters (cluster of SMP), 64bit x86 + Linux: IBM, Dell, HP, many others. Plus: MPI
       - Marked "?": GPU, Multicore, Cloud, FPGA, “big data” & Windows!
       - “Perfect Predator”: performance growth with decreasing cost and no code changes.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26. [Service logos with ages and usage figures; the pairing of logos to figures is not recoverable]
       - Ages shown: 2 years, 6 years, 7 years, 11 years, 12 years, 13 years, 15 years
       - Figures shown: 12 MM users; 2 Bil emails/day; 5 Bil conf mins/yr.; Update; 12 Bil queries/mo.; 40 Petabytes/mo.; 550 MM users/mo.; 450 MM users
       - 500 Million active Windows Live IDs
       - 9.9 Billion messages / day via WL Messenger
       - Over 1 Million BPOS Users in 36 Countries
  • 27. Microsoft’s Datacenter Evolution
       - Generation 1: Datacenter Co-Location
       - Generation 2: Quincy and San Antonio
       - Generation 3: Chicago and Dublin
       - Generation 4: Modular Datacenter
       [Diagram labels: Facility, PAC, Server, Capacity; Time to Market; Lower TCO]
  • 28. Generation 3 - Chicago Data Center
       - $500M+ investment
       - 707,000 sq ft
       - 1.5 million person-hours of labor
       - 3000 construction-related jobs
       - 3400 tons of steel
       - 2400 tons of copper
       - 190 miles of conduit
       - 7.5 miles of chilled water piping
       - 26,000 cubic yards of concrete
  • 29. Visual Studio
       - Visual Studio is used by over half of the professional programmers in the world
       - VS2010 – released a year ago – has been downloaded over 7 million times (more than 4 million extension downloads)
       - Main point: when we release a new capability into Visual Studio it automatically gets large adoption
       - (story about the ISC developers)
  • 30. Microsoft and GPUs The volume business….
  • 31. GPU Hardware Evolution
       Year   Version     Defining Feature
       1996   DirectX3    Hardware rasterization
       1997   DirectX5    2 Shading options to select
       1998   DirectX6    Multi-texture operations
       1999   DirectX7    Vertex Processing in hardware
       2000   DirectX8    Programmable Shaders: Vertex and Pixel
       2002   DirectX9    High Level Shading Language, 32 instr
       2003   DirectX9c   1000s of instructions per shader
       2006   DirectX10   Unified Shaders: consistent shader models
       2009   DirectX11   Compute Shader: explicit SIMD, random I/O
  • 32.
  • 33. The GPGPU Software Stack
       [Stack diagram, top to bottom]
       - High level tools and libraries: PGI “x86 CUDA”, CAPS, Culatools, Volara, Acceleware
       - Low Level Programming: CUDA, OpenCL, DirectCompute
       - Hardware: GPU: AMD & NVIDIA; Multicore x86: AMD & Intel
       Windows has broad support at all levels:
       - Supports all HW
       - Each of CUDA, OpenCL and DirectCompute
       - Almost all high level tools and libraries
  • 34. DirectCompute
       What is DirectCompute?
       - Microsoft’s GPGPU Programming Solution
       - API of the DirectX Family
       - Component of the Direct3D API
       Why Use DirectCompute Over Other APIs?
       - Interoperability with rest of 2D, 3D, Video rendering APIs (display computed results)
       - Cross-hardware compatibility
       - Feature compatibility guarantees
       - Access to fixed-function hardware
       Used extensively by the gaming community
       http://msdn.microsoft.com/directx
  • 35. GPGPU Development on Windows
       - Choice: CUDA, OpenCL or DirectCompute
       - Tools and libraries: Nsight and Visual Studio, PGI, CAPS, MATLAB, Jacket, PyCUDA, Quantifi, CUDA.NET, Culatools, NAG, Scicomp… many others
       - NVIDIA reports that over 80% of CUDA SDK downloads are for Windows
  • 36. Microsoft and NVIDIA: NVIDIA’s Parallel Nsight is integrated with Microsoft’s Visual Studio
  • 37. MATLAB
       [Diagram labels: Desktop Computer (Parallel Computing Toolbox); Computer Cluster (MATLAB Distributed Computing Server, Workers, Windows HPC Server)]
  • 38.
  • 39. [Diagram labels: Cluster HPC Applications; ISV / OSS Applications; Excel; MPI; SOA; HPC Middleware; HPC Pack; Operating Systems; HPC Edition; On Premise Cluster Computing]
       *Note that in SP1 support for MPI applications on Azure does not exist.
  • 40. Performance Parity Between Linux and Windows
       1 Million active Cells, 1000 wells, Blackoil. Elapsed Time [secs] by number of cores:

       Cores            1        2        4        6        8        16       24      32      48
       RedHat 5 U3      5200.43  3385.17  3095.72  2281.25  1790.59  1014.42  776.71  638.43  621.42
       Win HPC R2 SP1   5404.38  3298.55  3175.9   2171.37  1736.11  992.82   745.43  610.88  549.74

       Make your choice based on features and TCO…
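The parity claim on this slide can be checked directly from its own table. The sketch below, in Python, derives speedup and parallel efficiency relative to each platform's single-core run, plus the Windows-to-Linux elapsed-time ratio. The timings are copied from the slide; the function names and the choice of the one-core run as baseline are assumptions made for illustration.

```python
# Elapsed times in seconds, copied from the table on the slide.
cores = [1, 2, 4, 6, 8, 16, 24, 32, 48]
elapsed = {
    "RedHat 5 U3": [5200.43, 3385.17, 3095.72, 2281.25, 1790.59,
                    1014.42, 776.71, 638.43, 621.42],
    "Win HPC R2 SP1": [5404.38, 3298.55, 3175.9, 2171.37, 1736.11,
                       992.82, 745.43, 610.88, 549.74],
}


def speedup(times):
    """Speedup of each run relative to the single-core run on the same platform."""
    return [times[0] / t for t in times]


def efficiency(times):
    """Parallel efficiency: speedup divided by the number of cores used."""
    return [s / n for s, n in zip(speedup(times), cores)]


for name, times in elapsed.items():
    s, e = speedup(times), efficiency(times)
    print(f"{name}: speedup on {cores[-1]} cores = {s[-1]:.2f}, "
          f"efficiency = {e[-1]:.0%}")

# Ratio below 1.0 means Windows finished faster at that core count.
ratio = [w / l for w, l in zip(elapsed["Win HPC R2 SP1"], elapsed["RedHat 5 U3"])]
for n, r in zip(cores, ratio):
    print(f"{n:>2} cores: Windows/Linux elapsed-time ratio = {r:.3f}")
```

On these numbers the two platforms stay within roughly 12% of each other at every core count, and neither scales linearly: both reach less than a 10x speedup on 48 cores.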
  • 41. NEW
  • 42. Excel SOA Client
       - Connects to the cluster as a SOA client
       - VSTO code in workbook calls out to SOA Service
       - Input and output managed by Excel developer
       Excel Workbook on the Cluster (NEW)
       - Run multiple instances of Excel 2010 on an HPC Cluster
       - Each instance runs an iteration of the same workbook
       - Can be launched from Excel 2010 or a Windows program
       - Excel Dialog Suppression
       Excel UDF on the Cluster (NEW)
       - Run User Defined Functions in parallel on a cluster
       - Excel 2010 includes a new API and options for HPC cluster
       - Support for .XLL files developed through Excel SDK
       - Easy to develop on a desktop and then deploy to a cluster
  • 43. - Use Azure servers to run HPC compute jobs
       - Can be used to “burst-out” to the cloud to handle peak demand
       - Can create clusters that include dedicated on-premise servers, non-dedicated workstations and shared Azure servers
       - Jobs can run unchanged across all 3 types of compute nodes (no support for MPI in SP1)
       - Azure nodes are added to cluster using the Administration console (just like Workstation nodes)
       [Diagram labels: HPC Clients; Jobs; Requests; Head & Broker Nodes; Azure Gateway; Azure]
  • 44. Compute Nodes On-Premise and in Azure Simultaneously
       - “Burst” into cloud on-demand while keeping control over data and corporate policies
       - Pay only for what you use
       - A stepping stone to hybrid and public clouds.
       - Dynamically adjust how much runs on-premise and in the cloud
       [Diagram labels: HPC Head Node; Desktops; Broker Node; On-premise Compute Nodes; Azure; Azure Compute Proxies; Compute Instances]
  • 45. Parallel Development
       “Combined with Intel Parallel Studio, I think it is reasonable to say that Windows has the richest and most complete set of tools for multicore programming.” -- James Reinders, Intel, 12-April-2010
  • 46. Solution Begins with DEVELOPERS
       Make it easier to express and manage the correctness, efficiency and maintainability of parallelism on Microsoft platforms for developers of all skill levels
       - Enable developers to express parallelism easily and focus on the problem to be solved
       - Simplify the process of designing and testing parallel applications
       - Improve the efficiency and scalability of parallel applications
  • 47. Visual Studio 2010: Tools, Programming Models, Runtimes
       [Stack diagram; legend: Managed, Native, Tooling]
       - Tools (Visual Studio IDE): Parallel Debugger Tool Windows; Profiler Concurrency Analysis
       - Programming models, managed (.NET Framework 4): Parallel LINQ, Task Parallel Library, Data Structures
       - Programming models, native (Visual C++ 10): Parallel Pattern Library, Agents Library, Data Structures
       - Managed runtime: ThreadPool with Task Scheduler and Resource Manager
       - Native runtime: Concurrency Runtime with Task Scheduler and Resource Manager
       - Operating System: Windows Threads, UMS Threads
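The libraries named on this slide share one idea: the developer expresses independent units of work as tasks, and a runtime task scheduler maps them onto a pool of threads. The sketch below illustrates that model with Python's standard `concurrent.futures` module as a stand-in. It is not the Task Parallel Library or Parallel Pattern Library API, and `simulate_cell` and `parallel_for` are hypothetical names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_cell(i):
    """Hypothetical unit of independent work, standing in for a loop body."""
    return sum(k * k for k in range(i + 1))


def parallel_for(body, n, max_workers=4):
    """Run body(0) .. body(n-1) as tasks on a thread pool.

    The pool decides which worker thread runs each task; map() returns
    the results in index order regardless of completion order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(body, range(n)))


results = parallel_for(simulate_cell, 8)
serial = [simulate_cell(i) for i in range(8)]

# The parallel loop must produce exactly what the serial loop produces.
assert results == serial
print(results)
```

The caller never creates or joins a thread, which is the point of a task-based model. In CPython the global interpreter lock limits the speedup for CPU-bound bodies like this one, so the sketch shows the programming model rather than the performance a native runtime would deliver.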
  • 48.
  • 49.
  • 50. World’s Fastest House Construction: Three and a Half Hours
  • 51.
  • 52.
  • 53. http://www.microsoft.com/hpc
       David Rich, darich at microsoft.com
  • 54. © 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.