High Performance Computing

        Adam DeConinck
       R Systems NA, Inc.




Development of models begins at small scale.

Working on your laptop is convenient, simple.

Actual analysis, however, is slow.





“Scaling up” typically means a small server or
fast multi-core desktop.

There is a speedup, but for very large models it is not
significant.

Single machines don't scale up forever.


For the largest models, a different approach is required.


High-Performance Computing involves many
  distinct computer processors working together on
  the same calculation.

Large problems are divided into smaller parts and
  distributed among the many computers.

Usually clusters of quasi-independent computers
  which are coordinated by a central scheduler.
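
As a rough illustration of dividing a large problem into smaller parts, the sketch below splits a sum of squares into chunks, hands each chunk to a worker, and combines the partial results. It is a toy under stated assumptions: worker threads on one machine stand in for cluster nodes, and the names `split_into_chunks`, `partial_sum`, and `distributed_sum` are invented for this example. It shows the structure of the computation, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous, nearly equal parts."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done independently by one worker on its own chunk."""
    return sum(x * x for x in chunk)


def distributed_sum(items, n_workers=4):
    """Scatter chunks to workers, then gather and combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    # Threads stand in for cluster nodes (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    return sum(partials)


data = list(range(1, 1001))
total = distributed_sum(data, n_workers=4)
print(total)
```

On a real cluster the scatter and gather steps would go through the scheduler and the network rather than a thread pool.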


Typical HPC Cluster

    [Diagram. Labeled components: external connection, login node, scheduler,
    file server, compute nodes (“Computes”), Ethernet network, and high-speed
    network (10GigE / Infiniband).]

Performance gains

     [Plot: duration (s) versus number of cores, with a high-end workstation marked for comparison.]


     Performance test: stochastic finance model on R Systems cluster

     High-end workstation: 8 cores. Maximum speedup of 20x: 4.5 hrs → 14 minutes
      
          Scale-up heavily model-dependent: 5x – 100x in our tests, can be faster

     No more performance gain after ~500 cores: why? Some operations can't be parallelized.

     Additional cores? Run multiple models simultaneously
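
The plateau described above is what Amdahl's law predicts: if only a fraction p of the run time can be parallelized, the speedup on N cores is 1 / ((1 - p) + p / N), which can never exceed 1 / (1 - p). The sketch below evaluates that formula. The value p = 0.99 is an assumption chosen for illustration, not a measurement from the tests reported on this slide.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical model: 99% of the run time parallelizes (assumed, not measured).
p = 0.99
for cores in (8, 64, 500, 5000):
    print(f"{cores:>5} cores: {amdahl_speedup(p, cores):6.1f}x")
```

With this assumed p, going from 500 to 5000 cores adds little, because the serial 1% caps the speedup at 100x.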


Performance comes at a price: complexity.



    New paradigm: real-time analysis vs batch jobs.

    Applications must be written specifically to take
    advantage of distributed computing.

    Performance characteristics of applications change.

    Debugging becomes more of a challenge.



New paradigm: real-time analysis vs batch jobs.

Most small analyses are done in real time:    Large jobs are typically done in a batch model:

    “At-your-desk” analysis                       Submit job to a queue
    Small models only                             Much larger models
    Fast iterations                               Slow iterations
    No waiting for resources                      May need to wait
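
To make the batch model concrete, here is a minimal sketch of a first-in, first-out job queue on a fixed pool of cores. It is a simplification under stated assumptions: all jobs are submitted at time 0, the policy is strict FIFO, and the function `fifo_schedule`, the job names, and the numbers are hypothetical. Production schedulers usually apply more elaborate policies.

```python
import heapq


def fifo_schedule(jobs, total_cores):
    """Return {name: (start, end)} for jobs run in strict submission order.

    jobs: list of (name, cores, duration) tuples, all submitted at time 0.
    """
    free = total_cores
    running = []          # min-heap of (end_time, cores) for jobs in progress
    now = 0.0
    schedule = {}
    for name, cores, duration in jobs:
        if cores > total_cores:
            raise ValueError(f"{name} requests more cores than the cluster has")
        # Wait for running jobs to finish until enough cores are free.
        while free < cores:
            end_time, released = heapq.heappop(running)
            now = max(now, end_time)
            free += released
        schedule[name] = (now, now + duration)
        heapq.heappush(running, (now + duration, cores))
        free -= cores
    return schedule


# Hypothetical 8-core cluster and three jobs: (name, cores, duration in minutes).
jobs = [("big_model", 8, 10.0), ("medium", 4, 5.0), ("small", 4, 2.0)]
schedule = fifo_schedule(jobs, total_cores=8)
for name, (start, end) in schedule.items():
    print(f"{name}: waits {start} min, finishes at {end} min")
```

Even the two-minute job waits ten minutes here, which is the “may need to wait” cost of sharing a queue.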



Applications must be written specifically to
  take advantage of distributed
  computing.

    Explicitly split your problem into smaller
    “chunks”

    “Message passing” between processes

    Entire computation can be slowed by one
    or two slow chunks

    Exception: “embarrassingly parallel”
    problems

    Easy-to-split, independent chunks of
    computation

    Thankfully, many useful models fall under
    this heading (e.g. stochastic models).

    “Embarrassingly parallel” =
    no inter-process communication
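
A minimal sketch of an embarrassingly parallel stochastic computation follows. Each chunk gets its own seed and its own random stream, so no chunk ever needs a message from another; only the final averages are combined. The toy model (estimating the mean of a squared standard normal draw, whose true value is 1), the function names, and the use of threads in place of cluster nodes are all assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_paths):
    """Run n_paths independent trials on a private random stream; return the sum of outcomes."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_paths))


def run_model(n_chunks=8, paths_per_chunk=10_000):
    """Each chunk needs only its own seed, so chunks never communicate."""
    seeds = range(n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        totals = list(pool.map(simulate_chunk, seeds, [paths_per_chunk] * n_chunks))
    # The only combining step: average the per-chunk sums.
    return sum(totals) / (n_chunks * paths_per_chunk)


estimate = run_model()
print(estimate)
```

Because every chunk is seeded explicitly, rerunning the model reproduces the same estimate regardless of which worker finishes first.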

Performance characteristics of applications change.


On a single machine:

    CPU speed (compute)
    Cache
    Memory
    Disk

On a cluster:

    Single-machine metrics
    Network
    File server
    Scheduler contention
    Results from other nodes



Debugging becomes more of a challenge.


    More complexity = more pieces that can fail

    Race conditions: sequence of events no longer deterministic

    Single nodes can “stall” and slow the entire computation

    Scheduler, file server, login server all have their own challenges
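
One concrete way the loss of determinism shows up: when partial results arrive from nodes in a different order on each run, a naive floating-point sum can change in its last digits, because floating-point addition is not associative. The sketch below imitates two arrival orders for the same three partial results (the values are made up) and uses `math.fsum` as one order-independent way to combine them.

```python
import math

partials = [0.1, 0.2, 0.3]   # made-up partial results from three nodes
arrival_a = partials         # order in which results arrive on one run
arrival_b = partials[::-1]   # a different arrival order on another run

naive_a = sum(arrival_a)     # (0.1 + 0.2) + 0.3
naive_b = sum(arrival_b)     # (0.3 + 0.2) + 0.1
print(naive_a, naive_b, naive_a == naive_b)

# math.fsum returns the correctly rounded sum, so arrival order does not matter.
stable_a = math.fsum(arrival_a)
stable_b = math.fsum(arrival_b)
print(stable_a == stable_b)
```

Differences like this are harmless numerically but can make two runs of the same job look like a bug when their outputs are compared bit for bit.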




External resources

    One solution to handling complexity: outsource it!

    Historical HPC facilities: universities, national labs
    
        Often have the largest absolute compute capacity, and will sell
        excess capacity
    
        Must compete with academic projects for access; typically no
        SLA or high-level support

    Dedicated commercial HPC facilities providing “on-demand”
    compute power.



External HPC                      Internal HPC

    Outsource HPC sysadmin            Requires in-house expertise
    No hardware investment            Major investment in hardware
    Pay-as-you-go                     Possible idle time
    Easy to migrate to new tech       Upgrades require new hardware




Internal HPC                          External HPC

    No external contention                No guaranteed access
    All internal: easy security           Security arrangements complex
    Full control over configuration       Limited control of configuration
    Simpler licensing control             Some licensing complex

    Requires in-house expertise           Outsource HPC sysadmin
    Major investment in hardware          No hardware investment
    Possible idle time                    Pay-as-you-go
    Upgrades require new hardware         Easy to migrate to new tech



“The Cloud”

    “Cloud computing”: virtual machines and dynamic allocation of resources
    at an external provider

    Lower performance (virtualization), higher flexibility

    Usually no contracts necessary: pay with your credit card, get 16 nodes

    Often have to do all your own sysadmin

    Low support, high control



CASE STUDY:
Windows cluster for Actuarial Application




Global insurance company


    Needed 500-1000 cores on a temporary basis

    Preferred a utility, “pay-as-you-go” model

    Experimenting with external resources for “burst”
    capacity during high-activity periods

    Commercially licensed and supported application

    Requested a proof of concept

Cluster configuration

    Application embarrassingly parallel, small-to-medium data files,
    computationally and memory-intensive
    
        Prioritized computation (processors) and access to the fileserver
        over inter-node communication and large storage
    
        Upgraded memory in compute nodes to 2 GB/core

    128-node cluster: 3.0 GHz Intel Xeon processors, 8 cores per node for
    1024 cores total

    Windows HPC Server 2008 R2 operating system

    Application and fileserver on login node
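
The sizing figures above can be checked with a few lines of arithmetic. The node count, cores per node, and memory per core come from the slide; the derived per-node and cluster-wide memory totals are simple consequences, assuming every compute node received the 2 GB/core upgrade.

```python
nodes = 128
cores_per_node = 8
memory_per_core_gb = 2   # after the upgrade described above

total_cores = nodes * cores_per_node
memory_per_node_gb = cores_per_node * memory_per_core_gb
# Assumes all 128 nodes were upgraded to 2 GB/core.
total_memory_gb = total_cores * memory_per_core_gb

print(f"{total_cores} cores, {memory_per_node_gb} GB per node, {total_memory_gb} GB in total")
```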


Stumbling blocks

    Application optimization
    Customer had a wide variety of models which generated different usage
    patterns (IO, compute, memory-intensive jobs). Required dynamic
    reconfiguration for different conditions.

    Technical issue
    Iterative testing process. Application turned out to be generating massive
    fileserver contention. Had to make changes to both software and hardware.

    Human processes
    Users were accustomed to internal access model. Required changes both
    for providers (increase ease-of-use) and users (change workflow).

    Security
    Customer had never worked with an external provider before. Complex
    internal security policy had to be reconciled with remote access.

Lessons learned:


    Security was the biggest delaying factor. The initial security setup took over
    3 months from the first expression of interest, even though cluster setup
    was done in less than a week.
    
        Only mattered the first time though: subsequent runs started much
        more smoothly.

    A low-cost proof-of-concept run was important to demonstrate feasibility,
    and for working the bugs out.

    A good relationship with the application vendor was extremely important
    to solving problems and properly optimizing the model for performance.



Recent developments: GPUs




Graphics processing units




    CPU: complex, general-purpose processor

    GPU: highly specialized parallel processor, optimized for the operations used in
    common graphics routines

    Highly specialized → many more “cores” for same cost and space
     
         Intel Core i7: 4 cores @ 3.4 GHz: $300 = $75/core
     
         NVIDIA Tesla M2070: 448 cores @ 575 MHz: $4500 = $10/core

    Also higher bandwidth: 100+ GB/s for GPU vs 10-30 GB/s for CPU

    Same operations can be adapted for non-graphics applications: “GPGPU”
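
The per-core figures quoted above follow directly from the listed prices and core counts, as the short calculation below shows. The prices and core counts are the ones on the slide. Note that a GPU “core” is a much simpler unit than a CPU core, so this ratio compares cost per unit counted, not cost per unit of performance.

```python
# Prices and core counts as quoted on the slide.
processors = {
    "Intel Core i7": {"cores": 4, "price_usd": 300},
    "NVIDIA Tesla M2070": {"cores": 448, "price_usd": 4500},
}

cost_per_core = {
    name: spec["price_usd"] / spec["cores"] for name, spec in processors.items()
}

for name, cost in cost_per_core.items():
    print(f"{name}: ${cost:.2f} per core")
```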

    Image from http://blogs.nvidia.com/2009/12/whats-the-difference-between-a-cpu-and-a-gpu/

HPC/Actuarial using GPUs

    Random-number generation

    Finite-difference modeling

    Image processing

    Numerical Algorithms Group: GPU random-number generator

    MATLAB: operations on large arrays/matrices

    Wolfram Mathematica: symbolic math analysis


Data from http://www.nvidia.com/object/computational_finance.html




 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Dernier (20)

Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

High Performance Computing: an Introduction for the Society of Actuaries

  • 1. High Performance Computing. Adam DeConinck, R Systems NA, Inc.
  • 2. Development of models begins at small scale. Working on your laptop is convenient, simple. Actual analysis, however, is slow.
  • 3. Development of models begins at small scale. Working on your laptop is convenient, simple. Actual analysis, however, is slow. “Scaling up” typically means a small server or fast multi-core desktop. Speedup exists, but for very large models, not significant. Single machines don't scale up forever.
  • 4. For the largest models, a different approach is required.
  • 5. High-Performance Computing involves many distinct computer processors working together on the same calculation. Large problems are divided into smaller parts and distributed among the many computers. Usually clusters of quasi-independent computers which are coordinated by a central scheduler.
  • 6. Typical HPC Cluster
      [Diagram labels: external connection, login node, Ethernet network, scheduler, compute nodes, file server, high-speed network (10GigE / Infiniband)]
  • 7. Performance gains
      [Figure: duration (s) vs. number of cores, with the high-end workstation marked]
      - Performance test: stochastic finance model on R Systems cluster
      - High-end workstation: 8 cores. Maximum speedup of 20x: 4.5 hrs → 14 minutes
          - Scale-up heavily model-dependent: 5x – 100x in our tests, can be faster
      - No more performance gain after ~500 cores: why? Some operations can't be parallelized.
      - Additional cores? Run multiple models simultaneously
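The plateau described on slide 7 is the behavior commonly modeled by Amdahl's law, which the slides do not name. The sketch below is an illustration under that assumption; the parallel fraction of 0.995 is a hypothetical value chosen for the example, not a measurement from the R Systems test.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction`
    of the single-core run time can be spread across cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    p = 0.995  # hypothetical: 99.5% of the work parallelizes
    for cores in (8, 64, 500, 1000, 4000):
        print(f"{cores:5d} cores -> {amdahl_speedup(p, cores):6.1f}x")
    # The serial part alone caps the speedup at 1 / (1 - p).
    print(f"upper bound: {1.0 / (1.0 - p):.0f}x")
```

Under this model, adding cores beyond a certain point mostly shortens a part of the run that is already small, which is one way to read the slide's suggestion to use extra cores for running several models at once.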
  • 8. Performance comes at a price: complexity.
      - New paradigm: real-time analysis vs batch jobs.
      - Applications must be written specifically to take advantage of distributed computing.
      - Performance characteristics of applications change.
      - Debugging becomes more of a challenge.
  • 9. New paradigm: real-time analysis vs batch jobs.
      Most small analyses are done in real time:
      - “At-your-desk” analysis
      - Small models only
      - Fast iterations
      - No waiting for resources
      Large jobs are typically done in a batch model:
      - Submit job to a queue
      - Much larger models
      - Slow iterations
      - May need to wait
  • 10. Applications must be written specifically to take advantage of distributed computing.
      - Explicitly split your problem into smaller “chunks”
      - “Message passing” between processes
      - Entire computation can be slowed by one or two slow chunks
      - Exception: “embarrassingly parallel” problems
          - Easy-to-split, independent chunks of computation
          - Thankfully, many useful models fall under this heading (e.g. stochastic models)
      “Embarrassingly parallel” = no inter-process communication
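As a minimal sketch of the "embarrassingly parallel" case on slide 10, the example below splits a toy stochastic simulation into independent chunks, runs each chunk without any communication between them, and combines the results at the end. The toy payoff (a squared standard-normal draw), the function names, and the use of local processes instead of cluster nodes are all assumptions made for illustration.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def split_work(n_paths, n_chunks):
    """Divide n_paths into n_chunks (seed, size) pairs of near-equal size."""
    base, extra = divmod(n_paths, n_chunks)
    return [(seed, base + (1 if seed < extra else 0)) for seed in range(n_chunks)]


def simulate_chunk(chunk):
    """Run one independent chunk; it needs nothing from any other chunk."""
    seed, n_paths = chunk
    rng = random.Random(seed)  # per-chunk seed keeps the run reproducible
    total = 0.0
    for _ in range(n_paths):
        total += rng.gauss(0.0, 1.0) ** 2  # toy payoff, hypothetical
    return total, n_paths


def combine(results):
    """Merge per-chunk (sum, count) pairs into one overall mean."""
    total = sum(t for t, _ in results)
    count = sum(n for _, n in results)
    return total / count


def run(n_paths, n_chunks, workers=None):
    """Serial when workers is None, otherwise spread over local processes."""
    chunks = split_work(n_paths, n_chunks)
    if workers is None:
        results = list(map(simulate_chunk, chunks))
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(simulate_chunk, chunks))
    return combine(results)


if __name__ == "__main__":
    print(run(200_000, n_chunks=8, workers=4))
```

Because each chunk carries its own seed and the results are combined in chunk order, the serial and parallel paths give the same answer. The run still finishes only when the slowest chunk does, which is the "one or two slow chunks" caveat from the slide.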
  • 11. Performance characteristics of applications change.
      On a single machine:
      - CPU speed (compute)
      - Cache
      - Memory
      - Disk
      On a cluster:
      - Single-machine metrics
      - Network
      - File server
      - Scheduler contention
      - Results from other nodes
  • 12. Debugging becomes more of a challenge.
      - More complexity = more pieces that can fail
      - Race conditions: sequence of events no longer deterministic
      - Single nodes can “stall” and slow the entire computation
      - Scheduler, file server, login server all have their own challenges
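One concrete consequence of a non-deterministic sequence of events is that partial results may arrive from nodes in a different order on each run, and floating-point addition depends on order. The sketch below illustrates this with three hypothetical partial results; the values are chosen to make the effect visible and do not come from the slides. Using `math.fsum` is one possible mitigation, offered here as a suggestion and not as the author's method.

```python
import itertools
import math


def sum_in_arrival_order(values):
    """Add values one at a time, as a master process might when
    results are accumulated in whatever order they arrive."""
    total = 0.0
    for v in values:
        total += v
    return total


def distinct_sums(values):
    """Every total that some arrival order of `values` can produce."""
    return {sum_in_arrival_order(p) for p in itertools.permutations(values)}


def order_independent_sum(values):
    """math.fsum tracks exact partial sums, so order does not matter."""
    return math.fsum(values)


if __name__ == "__main__":
    partials = [1e16, 1.0, -1e16]  # hypothetical results from three nodes
    print(sorted(distinct_sums(partials)))
    print(order_independent_sum(partials))
```

When two runs of the same job disagree in the last digits, differences like this are worth ruling out before looking for a logic error.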
  • 13. External resources
      - One solution to handling complexity: outsource it!
      - Historical HPC facilities: universities, national labs
          - Often have the most absolute compute capacity, and will sell excess capacity
          - Competition with academic projects, typically do not include SLA or high-level support
      - Dedicated commercial HPC facilities providing “on-demand” compute power.
  • 14. External HPC vs. Internal HPC
      External HPC                | Internal HPC
      Outsource HPC sysadmin      | Requires in-house expertise
      No hardware investment      | Major investment in hardware
      Pay-as-you-go               | Possible idle time
      Easy to migrate to new tech | Upgrades require new hardware
  • 15. Internal HPC vs. External HPC
      Internal HPC                    | External HPC
      No external contention          | No guaranteed access
      All internal: easy security     | Security arrangements complex
      Full control over configuration | Limited control of configuration
      Simpler licensing control       | Some licensing complex
      Requires in-house expertise     | Outsource HPC sysadmin
      Major investment in hardware    | No hardware investment
      Possible idle time              | Pay-as-you-go
      Upgrades require new hardware   | Easy to migrate to new tech
  • 16. “The Cloud”
      - “Cloud computing”: virtual machines, dynamic allocation of resources in an external resource
      - Lower performance (virtualization), higher flexibility
      - Usually no contracts necessary: pay with your credit card, get 16 nodes
      - Often have to do all your own sysadmin
      - Low support, high control
  • 17. CASE STUDY: Windows cluster for Actuarial Application
  • 18. Global insurance company
      - Needed 500-1000 cores on a temporary basis
      - Preferred a utility, “pay-as-you-go” model
      - Experimenting with external resources for “burst” capacity during high-activity periods
      - Commercially licensed and supported application
      - Requested a proof of concept
  • 19. Cluster configuration
      - Application embarrassingly parallel, small-to-medium data files, computationally and memory-intensive
      - Prioritize computation (processors) and access to the fileserver over inter-node communication and large storage
      - Upgraded memory in compute nodes to 2 GB/core
      - 128-node cluster: 3.0 GHz Intel Xeon processors, 8 cores per node for 1024 cores total
      - Windows HPC Server 2008 R2 operating system
      - Application and fileserver on login node
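The sizing figures on slide 19 can be checked with a few lines of arithmetic. This is only a sketch: the per-node and cluster-wide memory totals are derived here from the stated 2 GB/core and are not quoted on the slide.

```python
def total_cores(nodes, cores_per_node):
    """Core count for a cluster of identical nodes."""
    return nodes * cores_per_node


def memory_per_node_gb(cores_per_node, gb_per_core):
    """Memory each node needs to provide gb_per_core to every core."""
    return cores_per_node * gb_per_core


if __name__ == "__main__":
    nodes, cores_per_node, gb_per_core = 128, 8, 2  # values from slide 19
    cores = total_cores(nodes, cores_per_node)
    per_node = memory_per_node_gb(cores_per_node, gb_per_core)
    print(f"{cores} cores, {per_node} GB per node, {nodes * per_node} GB in total")
```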
  • 20. Stumbling blocks
      - Application optimization: Customer had a wide variety of models which generated different usage patterns (IO, compute, memory-intensive jobs). Required dynamic reconfiguration for different conditions.
      - Technical issue: Iterative testing process. Application turned out to be generating massive fileserver contention. Had to make changes to both software and hardware.
      - Human processes: Users were accustomed to internal access model. Required changes both for providers (increase ease-of-use) and users (change workflow).
      - Security: Customer had never worked with an external provider before. Complex internal security policy had to be reconciled with remote access.
  • 21. Lessons learned:
      - Security was the biggest delaying factor. The initial security setup took over 3 months from the first expression of interest, even though cluster setup was done in less than a week.
          - Only mattered the first time though: subsequent runs started much more smoothly.
      - A low-cost proof-of-concept run was important to demonstrate feasibility, and for working the bugs out.
      - A good relationship with the application vendor was extremely important to solving problems and properly optimizing the model for performance.
  • 23. Graphics processing units
      - CPU: complex, general-purpose processor
      - GPU: highly-specialized parallel processor, optimized for performing operations for common graphics routines
      - Highly specialized → many more “cores” for same cost and space
          - Intel Core i7: 4 cores @ 3.4 GHz: $300 = $75/core
          - NVIDIA Tesla M2070: 448 cores @ 575 MHz: $4500 = $10/core
      - Also higher bandwidth: 100+ GB/s for GPU vs 10-30 GB/s for CPU
      - Same operations can be adapted for non-graphics applications: “GPGPU”
      Image from http://blogs.nvidia.com/2009/12/whats-the-difference-between-a-cpu-and-a-gpu/
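The per-core prices on slide 23 follow from dividing list price by core count. The short sketch below reproduces that arithmetic with the slide's own figures; the function name is made up for the example. Note that the GPU cores run at a much lower clock (575 MHz vs 3.4 GHz), so cost per core is not a like-for-like measure of cost per unit of work.

```python
def cost_per_core(price_usd, cores):
    """List price divided by the number of cores."""
    return price_usd / cores


if __name__ == "__main__":
    # (price in USD, core count) as quoted on slide 23
    processors = {
        "Intel Core i7": (300, 4),
        "NVIDIA Tesla M2070": (4500, 448),
    }
    for name, (price, cores) in processors.items():
        print(f"{name}: ${cost_per_core(price, cores):.2f} per core")
```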
  • 24. HPC/Actuarial using GPUs
      - Random-number generation
      - Finite-difference modeling
      - Image processing
      - Numerical Algorithms Group: GPU random-number generator
      - MATLAB: operations on large arrays/matrices
      - Wolfram Mathematica: symbolic math analysis
      Data from http://www.nvidia.com/object/computational_finance.html