GPU Computing
1. GPU Computing
  Graphics processing units - powerful, programmable, and highly parallel - are increasingly targeting general-purpose computing applications.
  Presented by: Khan Muhammad Nafee Mostafa (0507007), Dept. of CSE, KUET
2. GPU Computing
  J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone, J. C. Phillips. Proceedings of the IEEE, Vol. 96, No. 5, May 2008.
  We will concentrate on:
  - What is GPU Computing
  - Why GPU Computing
  - GPU Architecture and Evolution
  - GPU Computing Model
  - Software Environment
  - Future
4. What is GPU Computing?
  - GPU computing is the use of a GPU to do general-purpose scientific and engineering computing.
  - The CPU and GPU work together in a heterogeneous computing model.
  - The sequential part of the application runs on the CPU, and the computationally intensive part runs on the GPU.
  - From the user's perspective, the application simply runs faster because the GPU's high throughput is doing the heavy computation.
5. Why GPU Computing
  Over the past few years, the GPU has evolved from a fixed-function, special-purpose processor into a full-fledged parallel programmable processor with additional fixed-function, special-purpose functionality.
6. GPU for Non-Graphics Applications
  The GPU is designed for a particular class of applications with the following characteristics:
  - Computational requirements are large
  - Parallelism is substantial
  - Throughput is more important than latency
  A growing community has identified other applications with similar characteristics and successfully mapped them onto the GPU.
7. The GPU lends the CPU a hand for performance
  - Parallelism is the future of computing.
  - Many applications have to apply the same functions to huge sets of data.
  - Several stream processors can execute the same set of instructions on different data sets and deliver higher throughput.
  - If the GPU takes over some of the CPU's computational load, many applications can run faster.
8. GPU Architecture and Evolution
  The GPU has now turned into a programmable engine.
11. GPU Computing Model
  All GPU programs must be structured in this way: many parallel elements, each processed in parallel by a single program.
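The "many elements, one program" structure can be illustrated with a minimal CPU-side sketch in Python. This is an emulation only, not GPU code: `launch` and `saxpy_kernel` are hypothetical names chosen for this illustration, and the loop runs sequentially where a real GPU would process the elements in parallel.

```python
def saxpy_kernel(i, a, x, y, out):
    """The single program: runs once per element, identified by index i."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Emulated launch (hypothetical helper): run the same kernel for every
    index 0..n-1. On a GPU these iterations would execute in parallel."""
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Because no element depends on another element's result, the order in which the indices are processed does not matter, which is what makes this structure suitable for parallel hardware.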
12. Computing on the GPU: Programming a GPU for Graphics
  - The programmer specifies geometry covering a screen region; the rasterizer generates a fragment at each pixel location.
  - Each fragment is shaded by the fragment program (FP).
  - The FP computes the fragment by a combination of math operations and global memory reads.
  - The resulting image can be used as a texture on future passes.
13. Computing on the GPU: Graphics vs. General-Purpose Programs (Old)
  Programming a GPU for Graphics:
  - The programmer specifies geometry covering a screen region; the rasterizer generates a fragment at each pixel location.
  - Each fragment is shaded by the fragment program (FP).
  - The FP computes the fragment by a combination of math operations and global memory reads.
  - The resulting image can be used as a texture on future passes.
  Programming a GPU for General-Purpose Programs (Old):
  - The programmer specifies a geometric primitive covering the computation domain of interest; the rasterizer generates a fragment at each location.
  - Each fragment is shaded by an SPMD general-purpose FP.
  - The FP computes the fragment by a combination of math operations and 'gather' accesses from global memory.
  - The resulting buffer can be used as an input on future passes.
14. Computing on the GPU: Programming a GPU for General-Purpose Programs (New)
  - The programmer directly defines the computation domain of interest as a structured grid of threads.
  - An SPMD general-purpose program computes each thread.
  - Each thread is computed by a combination of math operations and both 'gather' (read) accesses from and 'scatter' (write) accesses to global memory. The same buffer can be used for both, allowing more flexible algorithms.
  - The resulting buffer in global memory can then be used as an input in future computation.
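The difference between 'gather' and 'scatter' can be shown with a small Python sketch. It is a sequential CPU emulation under stated assumptions: `launch`, `gather_kernel`, and `scatter_kernel` are hypothetical names, and the index list is a permutation so that no two emulated threads write to the same location.

```python
def gather_kernel(i, src, index, dst):
    # Thread i reads from a computed address and writes only to its own slot.
    dst[i] = src[index[i]]


def scatter_kernel(i, src, index, dst):
    # Thread i reads its own slot and writes to a computed address.
    dst[index[i]] = src[i]


def launch(kernel, n, *args):
    # Emulated launch (hypothetical helper): one kernel call per thread index.
    for i in range(n):
        kernel(i, *args)


src = ["a", "b", "c", "d"]
index = [2, 0, 3, 1]  # a permutation, so scatter writes never collide

gathered = [None] * len(src)
scattered = [None] * len(src)
launch(gather_kernel, len(src), src, index, gathered)
launch(scatter_kernel, len(src), src, index, scattered)

print(gathered)   # ['c', 'a', 'd', 'b']
print(scattered)  # ['b', 'd', 'a', 'c']
```

In the older fragment-program approach only the gather form was available, since each fragment could write only to its own output location; the newer model permits both.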
16. Software Environments
  - BrookGPU
  - Microsoft's Accelerator
  - Vendor-specific GPGPU systems:
    - AMD/ATI's CTM (Close to the Metal)
    - NVIDIA's CUDA (Compute Unified Device Architecture)
17. Scan performance on CPU, OpenGL and CUDA
  [Figure: scan performance on CPU, graphics-based GPU (using OpenGL), and direct-compute GPU (using CUDA). Results obtained on a GeForce 8800 GTX GPU and Intel Core2-Duo Extreme 2.93 GHz CPU. Figure adapted from Harris et al.]
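For readers unfamiliar with the operation being benchmarked, scan (prefix sum) replaces each element with the sum of all elements up to and including it. The following Python sketch shows one step-parallel formulation (the Hillis-Steele scheme). It is an illustrative assumption, not necessarily the algorithm measured in the figure, and `inclusive_scan` is a name chosen here.

```python
def inclusive_scan(values):
    """Inclusive prefix sum using log2(n) data-parallel steps (Hillis-Steele)."""
    data = list(values)
    n = len(data)
    offset = 1
    while offset < n:
        # Within one step, every output element depends only on the previous
        # buffer, so each could be computed by an independent GPU thread.
        data = [
            data[i] + data[i - offset] if i >= offset else data[i]
            for i in range(n)
        ]
        offset *= 2
    return data


print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# [3, 4, 11, 11, 15, 16, 22, 25]
```

Each pass builds a new buffer from the old one, which mirrors the multi-pass style described on the earlier slides, where the result of one pass becomes the input of the next.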
19. Concluding: a Bright Future
  - Support for double-precision floating point
  - Higher-bandwidth path between CPU and GPU (like AMD's HyperTransport)
  - More tightly coupled CPU and GPU (AMD's Fusion or NVIDIA's nForce)
  - NVIDIA Quadro for multiple-GPU collaboration
  Finally, let us wait for a new era when GPU computing will rule.