GPU Programming in Pure Python
- Track: None of these topics
- Type: Talk
- Level: intermediate
- Duration: 30 minutes
Abstract
GPU programming can be scary, but it doesn't need to be! Did you know you can access the full performance of CUDA purely in Python? The CUDA Python stack gives you a friendly interface for getting started with GPU acceleration.
In this example-driven talk, we'll begin with a general discussion of the CUDA programming model and how to manage accelerator devices in Python with cuda.core. Next, we'll show you how to create arrays and launch work with CuPy. Then you'll learn how to customize parallel algorithms with cuda.compute, write your own kernels that use cooperative algorithms with cuda.coop, and integrate seamlessly with accelerated libraries such as cuBLAS and cuDNN.
We'll look at a variety of parallel examples, from counting words to implementing softmax and row-wise reductions.
By the time the talk is over, you'll be ready to start accelerating your Python code with GPUs!