Sharan Kalwani 
/ sharan.kalwani@acm.org 
www.linkedin.com/sharankalwani 
Outline 
o History of Supercomputing * 
o Technologies 
o Modern Day HPC: 
o Current State of the Art 
o Peering beyond the Horizon: 
o Next Set of technologies 
* aka High Performance Computing (HPC)
History 
 Computing Demand: 
 driven by needs far beyond contemporary capability 
 Early adopters: (1970s) 
LANL (Los Alamos National Lab) and 
NCAR (National Center for Atmospheric Research) 
 Characteristics: domain specific needs 
 Features: High Speed Calculations: PDE, Matrices 
 1972: Seymour Cray (CDC, Cray Research Inc.) 
 1st Model Cray-1
History 
 Cray-1 Characteristics: (1975-1976) 
 64 bit word length 
 12.5 nanosecond clock period → 80 MHz 
 “original” RISC 
 1 clock == 1 instruction 
 Vector instruction set: single instruction, multiple data – a true multiplier effect (see the sketch after this list) 
 Matrix operations, pipelining 
 included chained add+multiply! 
 memory <> processor balance 
 Cray-1, Cray-XMP, Cray-YMP, Cray-2, Cray 3
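Vector hardware pays off on loops like SAXPY (y ← a·x + y), the canonical multiply-plus-add kernel. Here is a minimal C sketch of such a loop; the function name and types are illustrative rather than from the deck, but this is exactly the pattern a Cray vector unit (or a modern SIMD CPU) executes many elements at a time, with the chained add+multiply above corresponding to the fused multiply-add in the loop body:

```c
#include <stddef.h>

/* SAXPY: y = a*x + y -- the canonical vector kernel.
 * On a vector machine the multiply and add are chained, and one
 * vector instruction produces many results, instead of one scalar
 * result per instruction. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```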
History 
 Enter the domain of MPP 
 Massively parallel processors 
 Introduction of Torus architectures 
 Seen these days in some offerings 
 Cray T3D/T3E…. (the T3E was the 1st machine to sustain 1,000,000,000,000 calculations/sec on a real application)
Cray T3 architecture (logical) circa 1993, looks a lot like a cluster, eh?
Hardware Contributions: _Phase_ 1 
 Profusion of technologies: 
 RISC inspiration (1 clock cycle → 1 instruction) 
 Solid State Disk – recognized the need for keeping CPU busy all the time 
 multi-core software – coherence + synchronization 
 De-coupling of I/O from compute 
 Massive memory 
 I/O technologies – HiPPI (HIgh-Performance Parallel Interface) 
 Visualization driver 
 Chip set design, ECL -> CMOS integration 
 Parallel processing software foundation -> MPI
Solid State Disk (grandpa USB stick) 
 The first CRAY X-MP system had SSD in 1982. 
 Designed for nearly immediate reading and writing of very large data files. 
 Data transfer rates of up to 1.25 GBytes/second, 
 Far exceeding *any* other data transfer devices in its time. 
 SSDs offered in sizes of 64, 128, or 256 million bytes of storage. 
 The hole in the cabinet was to attach a very high speed (VHISP) data channel to an SSD. 
 Link referred to as the "skyway." 
 Via software, the SSD is logically accessed as a disk unit. 
SSD driver ~ 1200 lines of C code!
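To make "logically accessed as a disk unit" concrete, here is a toy sketch – entirely hypothetical, not the Cray driver – of block-addressed reads over a flat region of fast memory, which is essentially what presenting an SSD as a disk amounts to. Existing disk-oriented software could then use the device unchanged, while reads complete at memory speed:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: a RAM-backed "device" read like a disk.
 * The real Cray SSD driver was reportedly ~1200 lines of C; this
 * only shows the idea of block-addressed access to fast memory. */
enum { BLOCK_SIZE = 4096 };

struct ssd {
    unsigned char *mem;   /* the solid-state store        */
    size_t nblocks;       /* capacity in BLOCK_SIZE units */
};

/* Read one block, as a disk driver would; returns 0 on success. */
int ssd_read_block(const struct ssd *dev, size_t blkno, void *buf)
{
    if (blkno >= dev->nblocks)
        return -1;
    memcpy(buf, dev->mem + blkno * BLOCK_SIZE, BLOCK_SIZE);
    return 0;
}
```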
History Marches on…. 
 Battle of technologies: 
 Silicon v. Gallium Arsenide 
 Vector v. Killer Micros 
 Accelerated Strategic Computing Initiative (mid 90s) ASCI project changed directions for everyone
Speed 
 Everybody was focused on clock speed 
 Or Floating Point Operations (FLOPS/sec) 
 @12.5 ns clock period → 80 million FLOPS/sec (peak); see the worked arithmetic after this list 
 Leading to the famous Macho FLOPS race: 
 USA v Japan (90s) 
Megaflops → Gigaflops (1000 MF) 
Gigaflops → Teraflops (1000 GF) 
Teraflops → Petaflops (1000 TF) 
In 2018 the industry expects an ExaFlop machine!
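As a worked check of the numbers above, under the one-result-per-cycle RISC model from the Cray-1 slide (with the chained add+multiply the hardware peak doubles, to 160 MFLOPS):

```latex
f_{\text{clock}} = \frac{1}{12.5\,\text{ns}} = 80\,\text{MHz},
\qquad
\text{peak} = f_{\text{clock}} \times 1\,\tfrac{\text{FLOP}}{\text{cycle}}
            = 80\,\text{MFLOPS}.
```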
Speed 
 First GigaFlop/sec* System 
 Cray YMP and Cray-2 
 First TeraFlop/sec System 
 Sandia National Lab ASCI “Red” (Intel) 
 First PetaFlop/sec System 
LANL “RoadRunner” (IBM) 
 First ExaFlop/sec System 
??? 
* SUSTAINED!
Is anyone keeping score? 
 Birth of the Top500 list 
 1993 – Dongarra, Strohmaier, Meuer & Simon 
 Based on the LINPACK (Linear Algebra Package) benchmark (see the note after this list) 
 Offshoots: 
 Green500 (power efficiency) 
 Graph500 (search oriented – little to no floating point computation) 
 @SC13 a new replacement metric (HPCG) was proposed
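For context, the Top500 metric (HPL, the parallel LINPACK benchmark) times the solution of one dense n×n linear system Ax = b and divides the conventional operation count by the run time; this is the standard formula, not anything specific to this deck:

```latex
\text{FLOPs}(n) \approx \tfrac{2}{3}n^{3} + 2n^{2},
\qquad
R_{\max} = \frac{\text{FLOPs}(n)}{t_{\text{solve}}}.
```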
Is anyone keeping score? 
We will return to this …..
Cluster Growth propelled by HPC
Linux Growth propelled by HPC
Track Record of Linux versions in HPC 
•See also Linux Foundation report 
•http://www.linuxfoundation.org/publications/linux-foundation/top500_report_2013
HPC top500 - factoids 
•Current #1 system has 3,120,000 cores 
–Located in China, called Tianhe-2 “Milky Way” 
–Peak speed of 33.9 PetaFlops/second (quadrillions of calculations per second) 
–Needs 17.8 MW of power 
•Current #2 system @ ORNL (US Government DoE) in Tennessee 
–Has 560,640 cores, called Titan 
–Peak speed of 17.6 PetaFlops/second 
–Needs 8.21 MW of power
HPC top500 - factoids 
•Tianhe-2
HPC top500 - factoids 
•Titan (http://www.ornl.gov/info/press_releases/get_press_release.cfm?ReleaseNumber=mr20121029-00) 
HPC top500 - factoids 
•Titan
Treemap of countries in HPC
Operating Systems: History 
 Early HPC OS: 
 tiny assembled loaders 
 CTSS (Cray Time-Sharing System) – derived from LTSS 
 Cray Operating System (COS) 
 Cray UNIX (UNICOS) 
 Microkernels – CHORUS 
 Beowulf cluster – Linux appears
Linux Contributions: History 
 Linux – attack of the killer micros, 1992
Linux Contributions: History 
 NOW – Network of workstations, 1993
Linux Contributions: History 1993-1994 
 133 nodes – Stone SouperComputer 
 First Beowulf cluster 
 Concept pioneered at NASA/Caltech 
 Thomas Sterling and Donald Becker
Linux Contributions: History 1993-1994 
•Beowulf [two photo slides of early Beowulf clusters]
Linux Contributions: History 1993-1994 
•Beowulf 
•NASA 
•LSU 
•Indiana University 
Thomas Sterling
Linux Contributions: History 
•Beowulf Components: 
–Parallel Virtual Machine (PVM) – U Tennessee 
–Message Passing Interface (MPI) – several folks 
–Jack Dongarra, Tony Hey and David Walker 
–Support of NSF and ARPA 
–Today we have the MPI Forum 
–MPI 2 and now MPI 3 
–OpenMPI, MPICH, etc. 
–Future: Pthreads and OpenMP (see the minimal MPI sketch after this list)
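For a concrete taste of the message-passing model these components standardize, here is a minimal MPI sketch in C (generic MPI-1-era calls, buildable with MPICH or OpenMPI; the file name and token value are illustrative): every rank reports itself, then rank 0 sends one integer to rank 1.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, token = 42;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("hello from rank %d of %d\n", rank, size);

    if (size > 1) {
        if (rank == 0)          /* explicit message passing ...   */
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1) {   /* ... no shared memory assumed   */
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", token);
        }
    }
    MPI_Finalize();
    return 0;
}
```

Compile and run with, e.g., `mpicc ring.c -o ring && mpirun -np 4 ./ring`.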
HPC and Linux 
•Beowulf Attributes (or cluster features): 
•Open Source 
•Low Cost 
•Elasticity 
•Equal Spread of work (seeds of cloud computing here!!) 
•These days the Linux kernel can handle 64 cores! HPC pushes this limit even further….
HPC and Linux pushing the boundaries 
•File systems: 
–Large number of high performance file systems 
–Lustre, now in version 2.5 
–Beats HDFS several times over!! 
–You can host HDFS over many HPC filesystems for massive gains
Typical Stack 
[Stack diagram: a Linux-based distro (usually Enterprise class) layered over the hardware]
Hardware Contributions: _Phase_ 2 
 Profusion of technologies: 
 In-memory processing, many HPC sites implemented this 
 1992: special systems built for use in cryptography using these techniques 
 Graph traversal systems – now available as appliances by HPC vendors 
 Massive memory : single memory systems over several TB in size 
 InfiniBand interconnects: hitting 100 Gbit/sec; you can buy such switches now 
 Parallel processing software foundation -> replacements for MPI stack being worked on
Modern Day HPC 
•Building the ExaScale machine: 
–Exascale is 1 quintillion calculations/second 
–1000x Petaflops/sec 
–Also known as 10^18 (hence the 2018 projections) 
–1,000,000,000,000,000,000 floating point calculations/second (sustained) 
–How to feed this monster?
Modern Day HPC 
•Solutions for the ExaScale monster: 
•Inevitably the Big Data community should watch, support, and benefit from the issues we are tackling now: 
–Memory matters! 
–Resiliency in software 
–Robustness in hardware 
–Co-Design critical 
–Power Consumption and Cooling (estimate several megawatts w/ present day approaches) 
–Utterly new architectures needed
Applications: What did we use all this for? 
Weather 
Automotive and Aerospace Design 
Traditional Sciences 
Energy (Nuclear, Oil & Gas) 
Bioinformatics 
Cryptography 
 and……big data
Applications: traditional HPC 
Automotive & (similar) Aerospace Design 
o Car Crash Analysis – prime usage, 50% 
o Each physical crash test costs $0.5 million 
o Virtual Prototype test - $1000 (or less)
Applications: The real deal vs. HPC 
•NHTSA requires physical validation 
•Previously, physical crash tests cost a total of $100 million/year 
•Limited to a small suite: 12 tests 
•Today we can do over 140+ different tests (for each vehicle) and with: 
–Less cost (we instead increased the # of tests!) 
–Faster response (12 months v 5 years) 
–Many more design iterations (hundreds v 10)
HPC for weather forecasting
Whither HPC and the Cloud?
HPC for Crisis Assist
Technology March! Or why simulation matters? 
•Increasing resolving power – greater problem fidelity 
•Decreasing product design turnaround times 
•Increasing cost-effectiveness relative to experiment & observation 
•Reducing uncertainty 
•Ramping up the ease of use by non-experts 
•Powerful tool in resolving scientific questions, engineering designs, and policy support 
•Co-execution environments for simulation, large-scale data enabled science, and scientific visualization 
•Simply put: Better Answers, thus delivering….. 
–Attractiveness to the creative and entrepreneurial classes 
–Straightforward case for national economic competitiveness!!!
We need more HPC because….
What about costs???? [three slides of cost charts]
HPC is indispensable! 
•Establish Capability 
•Enable Adding of Complexity 
•Gain a real and better Understanding 
•And do not forget all that data! 
•How do we tie it in?......
Approaching the eXtreme Scale 
•Current paradigm: Simulation – lots of equations which mimic or model actual situations (“Third” Paradigm) 
•Emerging paradigm: Operate without models (Big Data) (“Fourth” Paradigm)
Operate without models (Big Data) 
•[Diagram: BEFORE – Models/Theory alone; NOW/FUTURE – Models/Theory driven by DATA]
Best Example….. tell us what we do not know! 
•Recent Success: 
•Solar observations (actual data) 
•Unknown Surface Perturbations or Energy 
•Could not be explained by any classical model 
•Resorted to automated machine learning driven alternate search 
•Answer: Solar Earthquakes and Thunderclaps, classic acoustic signature! 
•New profession: Solar Seismologists!
Trend began a decade+ ago…. http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_complete_lr.pdf
Everyone is seriously interested….. http://science.energy.gov/~/media/ascr/ascac/pdf/reports/exascale_subcommittee_report.pdf
A Peek at the Future……. 
•Yes…we should definitely care 
•The bigger and more relevant questions are: 
–What architecture? What programming model? 
–Power consumption will dominate 
•Currently 4 approaches: 
–Stay the course? Not!! 
–All GPGPU based? 
–ARM based? 
–Quantum Computing ??
GPGPU perspective…. 
[Benchmark slide: a 16-core server node (2 × 8-core Xeon E5-2667 CPUs) with four Tesla K20X GPUs running an ANSYS Fluent external-aero simulation – 2.9X solver speedup for the CPU+GPU configuration over CPU-only]
A Peek at the Future……. 
–GPGPU 
[Architecture diagram: CPU and cache with DDR memory, linked through an I/O hub over PCI-Express to a GPU with its own GDDR memory]
A Peek at the Future……. 
–GPGPU based? 
–http://www.anl.gov/events/overview-nvidia-exascale-processor-architecture-co-design-philosophy-and-application-results 
–Echelon 
–DragonFly 
–http://www.nvidia.com/content/PDF/sc_2010/theater/Dally_SC10.pdf
A Peek at the Future…….
A Peek at the Future……. 
–ARM or Pi based? 
–http://coen.boisestate.edu/ece/files/2013/05/Creating.a.Raspberry.Pi-Based.Beowulf.Cluster_v2.pdf
Quantum Computing……. 
•D-WAVE systems installed at NASA Ames lab 
•Uses a special chip, the 512-qubit “Vesuvius” 
•Uses 12 KW of power 
•Cooled to 0.02 kelvin (100 times colder than outer space) 
•RF shielding
Quantum Computing……. 
•Shor’s Integer Factorization Algorithm 
•Problem: Given a composite n-bit integer, find a nontrivial factor. 
–Best-known deterministic algorithm on a classical computer has time complexity exp(O(n^(1/3) log^(2/3) n)). 
•A quantum computer can solve this problem in O(n^3) operations. 
Peter Shor 
Algorithms for Quantum Computation: Discrete Logarithms and Factoring 
Proc. 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124-134
Quantum Computing……. 
•Classical: number field sieve 
–Time complexity: exp(O(n^(1/3) log^(2/3) n)) 
–Time for 512-bit number: 8400 MIPS years 
–Time for 1024-bit number: 1.6 billion times longer 
•Quantum: Shor’s algorithm 
–Time complexity: O(n^3) 
–Time for 512-bit number: 3.5 hours 
–Time for 1024-bit number: 31 hours 
•(assuming a 1 GHz quantum machine; see the scaling check below) 
See M. Oskin, F. Chong, I. Chuang 
A Practical Architecture for Reliable Quantum Computers 
IEEE Computer, 2002, pp. 79-87
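A quick sanity check on those quantum timings, assuming pure O(n^3) scaling (the cited 31 hours presumably includes constant-factor overheads beyond the cubic term):

```latex
\frac{t_{1024}}{t_{512}} \approx \left(\frac{1024}{512}\right)^{3} = 2^{3} = 8,
\qquad
8 \times 3.5\,\text{h} = 28\,\text{h} \approx 31\,\text{h (cited)}.
```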
What I will be looking into……. 
•Julia 
•Programming Environment which combines *all* the elements of: 
–R (express data handling) 
–Scientific and Engineering process (e.g. MATLAB like) 
–Parallel processing and distributed computing functional approaches (similar to Scala, Erlang and others) 
–Python and other integration packages already there 
–Happy marriage of several arenas 
–Now in early release 
•Feel free to contact or follow up with me on this
SUMMARY: Core Competencies Across HPC 
Core Competencies 
Extreme scale 
Architecture 
Compute 
I/O 
Memory 
Storage/data management 
Tera, Peta, Exabytes…. 
Visualization and analytics 
Fast fabrics 
Future architectural direction 
Parallelism to extreme parallelism 
Multi core 
Programming models 
Big Data 
Models, applications, applied analytics 
Structured, unstructured data types
The need for a new discipline: HPC experts + Domain Expertise == Simulation Specialists 
Core Competencies: 
Extreme scale 
Architecture 
Compute 
I/O 
Memory 
Storage/data management 
Tera, Peta, Exabytes…. 
Visualization and analytics 
Fast fabrics 
Future architectural direction 
Parallelism to extreme parallelism 
Multi core 
Programming models 
Big Data 
Models, applications, applied analytics 
Structured, unstructured data types 
Where would this Computational Specialist work? 
National security 
Fraud detection 
Grand challenge science: Physics, Chemistry, Biology, Weather/climate, energy etc. 
Bio/life sciences 
Healthcare 
Energy/Geophysics 
Financial modeling, high frequency and algorithmic trading 
Entertainment/media 
Auto/aero/mfg. 
Consumer electronics 
Risk informatics: insurance, global, financial, medical etc. 
Optimization models 
Discovery analytics
On a lighter note….. [a series of cartoon slides]
Further reading….. [book-cover slides; one title currently under review]
Innovative uses of HPC (LinkedIn.com)
Thank you….. 
•Email: sharan dot kalwani at acm dot org