CUDA vs OpenCL

Slide deck from an ArrayFire webinar comparing CUDA and OpenCL.

  1. ArrayFire Webinar: OpenCL and CUDA Trade-Offs and Comparisons
  2. GPU Software Features
     • Programmability
     • Portability
     • Scalability
     • Performance
     • Community
  3. Leading GPU Computing Platforms
     • Programmability
     • Portability
     • Scalability
     • Performance
     • Community
  4. Performance
     • Both CUDA and OpenCL are fast
     • Both can fully utilize the hardware
     • Devil in the details:
       – Hardware type, algorithm type, code quality
  5. Performance
     • ArrayFire results at end of webinar
  6. Scalability
     • Laptops --> Single GPU Machine
       – Both CUDA and OpenCL scale, no code change
     • Single GPU Machine --> Multi-GPU Machine
       – User managed, low-level synchronization
     • Multi-GPU Machine --> Cluster
       – MPI
  7. Scalability
     Interesting developments:
     • Memory Transfer Optimizations
       – CUDA GPUDirect technology
     • Mobile GPU Computing
       – OpenCL available on ARM, Imgtec, Freescale, …
  8. Scalability
     • Laptops --> Single GPU Machine
       – ArrayFire’s JIT optimizes for GPU type
     • Single GPU Machine --> Multi-GPU Machine
       – ArrayFire’s deviceset() function is super easy
     • Multi-GPU Machine --> Cluster
       – MPI
  9. Portability
     • CUDA is NVIDIA-only
       – Open source announcement
       – Does not provide CPU fallback
     • OpenCL is the open industry standard
       – Runs on AMD, Intel, and NVIDIA
       – Provides CPU fallback
  10. Portability
     • ArrayFire is fully portable
       – Same ArrayFire code runs on CUDA or OpenCL
       – Simply select the right library
  11. Community
     • NVIDIA CUDA Forums – 26,893 topics
     • AMD OpenCL Forums – 4,038 topics
     • Stack Overflow CUDA Tag – 1,709 tags
     • Stack Overflow OpenCL Tag – 564 tags
  12. Community
     • AccelerEyes GPU Forums – 1,435 topics
       – Largest GPU forums by a software company
     • Next largest
       – PGI GPU Forums – 485 topics
  13. Programmability
     • Both CUDA and OpenCL are low-level
       – Time consuming kernel development
       – Data-parallel algorithm design
     • Focus on programmability interfaces
  14. Programmability
     [Chart. Axes: Faster to Slower, Time-consuming to Easy-to-use. Entries: SSE or AVX]
  15. Programmability
     [Chart. Same axes. Entries: Writing Kernels; SSE or AVX]
  16. Programmability
     [Chart. Same axes. Entries: Writing Kernels; SSE or AVX; Compiler Directives]
  17. Programmability
     [Chart. Same axes. Entries: Writing Kernels; Using Libraries; SSE or AVX; Compiler Directives]
  18. Libraries Make the Difference
     Raw math libraries in NVIDIA CUDA
     • CUBLAS, CUFFT, CULA, Magma
       – Provides all BLAS, LAPACK, and FFT routines necessary for most dense matrix operations
     • CUSPARSE
       – A good start for sparse linear algebra
  19. Libraries Make the Difference
     Raw math libraries in AMD OpenCL
     • clAmdBlas, clAmdFft
       – Provides most important BLAS routines
       – Provides radix 2, 3, and 5 FFT routines
       – No LAPACK support
       – No sparse data support
  20. Libraries Make the Difference
     Raw math libraries in AMD OpenCL
     • clAmdBlas, clAmdFft
       – Provides most important BLAS routines
       – Provides radix 2, 3, and 5 FFT routines
       – No LAPACK support
       – No sparse data support
       – Runs on any OpenCL-compliant device
  21. ArrayFire Code and Benchmarks
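
The slides on AMD's OpenCL math libraries mention that clAmdFft provides radix 2, 3, and 5 FFT routines. A minimal C++ sketch of what that can mean for callers follows, assuming the phrase refers to transform lengths whose only prime factors are 2, 3, and 5. The helper names `is_radix_235_length` and `next_radix_235_length` are hypothetical and are not part of clAmdFft or any other library named in the deck.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical helper (not a clAmdFft call): returns true when n is
// positive and has no prime factors other than 2, 3, and 5.
bool is_radix_235_length(std::size_t n) {
    if (n == 0) return false;
    for (std::size_t p : {2u, 3u, 5u}) {
        // Strip every factor of p from n.
        while (n % p == 0) n /= p;
    }
    // Anything left over is a prime factor other than 2, 3, or 5.
    return n == 1;
}

// Hypothetical helper: smallest length >= n that passes the check above.
// A caller could zero-pad a signal up to this length before transforming it.
std::size_t next_radix_235_length(std::size_t n) {
    if (n == 0) return 1;
    while (!is_radix_235_length(n)) ++n;
    return n;
}
```

Under that assumption, a length of 1000 (2^3 · 5^3) would qualify as-is, while a length of 1001 (7 · 11 · 13) would need padding to 1024. The linear search is kept for clarity; it is adequate for typical signal lengths but is not tuned for speed.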
  • camelotsa

    Aug. 12, 2013
