Computer Science

What is GPU Computing?

GPU computing or GPGPU is the use of a GPU (graphics processing unit) to do general purpose scientific and engineering computing. The model for GPU computing is to use a CPU and GPU together in a heterogeneous co-processing computing model. The sequential part of the application runs on the CPU and the computationally-intensive part is accelerated by the GPU. From the user’s perspective, the application just runs faster because it is using the high-performance of the GPU to boost performance. gpu-computing-feature The GPU has evolved over the years to have teraflops of floating point performance. NVIDIA revolutionized the GPGPU and accelerated computing world in 2006-2007 by introducing its new massively parallel architecture called “CUDA”. The CUDA architecture consists of 100s of processor cores that operate together to crunch through the data set in the application. The success of GPGPUs in the past few years has been the ease of programming of the associated CUDA parallel programming model. In this programming model, the application developer modify their application to take the compute-intensive kernels and map them to the GPU. The rest of the application remains on the CPU. Mapping a function to the GPU involves rewriting the function to expose the parallelism in the function and adding “C” keywords to move data to and from the GPU. The developer is tasked with launching 10s of 1000s of threads simultaneously. The GPU hardware manages the threads and does thread scheduling. The Tesla 20-series GPU is based on the “Fermi” architecture, which is the latest CUDA architecture. Fermi is optimized for scientific applications with key features such as 500+ gigaflops of IEEE standard double precision floating point hardware support, L1 and L2 caches, ECC memory error protection, local user-managed data caches in the form of shared memory dispersed throughout the GPU, coalesced memory accesses and so on.