Online Training: Fundamentals of Accelerated Computing with CUDA C/C++

4 Nov, 2020

Recently, AI has achieved state-of-the-art performance on a broad spectrum of problems, ranging from computer vision to language translation to protein structure prediction. This performance is the result of a complex interaction among improved algorithmic design, the availability of larger and better datasets, and the availability of accelerated computing hardware. Currently, the two widely used types of accelerators are general-purpose graphics processing units (GPGPUs) and tensor processing units (TPUs).

The rise of GPU use in scientific computing began in the early 2000s, when GPUs were used to accelerate matrix and tensor multiplications. As a large fraction of problems in bioinformatics are embarrassingly parallel, GPUs were quickly adopted by the bioinformatics community to solve them. Examples include BarraCUDA (Langdon et al., 2016) for sequence alignment and CUDA–MEME (Liu et al., 2010) for novel motif discovery.

To be able to write complete programs from scratch on the GPU, I needed a better understanding of the GPU programming model. I therefore registered for this course from NVIDIA’s Deep Learning Institute, where I got to develop software and write kernels for the GPU using C and CUDA. This training has given me a much better understanding of CUDA programming and of best practices for developing GPU-oriented code. I am now using these newly acquired skills to accelerate many of the computational problems I face in my main project.

Hesahm El Abd