Prototyping accelerated applications with CUDA in Python
A lot of hard problems can't be solved simply by throwing more CPUs at them. That's where GPGPUs come into play, provided the problem can be recast in a reasonably parallelizable form. CUDA underpins quite a few frameworks, from ray tracing (e.g. OptiX) to machine learning (e.g. Theano, TensorFlow).
Writing CUDA code essentially requires you to think about the problem extensively in a parallel manner and to write efficient C/C++ code.
CUDA also has Python support, which lets you prototype quickly when exploring any new problem you might want to tackle. This talk will sit comfortably between treating these frameworks as a black box and actually having to write C/C++ code: you'll learn how these frameworks use CUDA and how you can prototype algorithms in Python.
- Brief overview on GPU architecture
- CUDA 101: Mandelbrot
- CUDA 102: Raytracing
- Examples in NumbaPro (a Python binding for CUDA)
- Building a new operation from the ground up
- Trying to find a good fit in already existing frameworks
- Prototyping in Python
- CUDA vs OpenCL
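As a taste of the "CUDA 101: Mandelbrot" item above: the Mandelbrot set is naturally data-parallel, since every pixel's escape-time count is independent of all others. The sketch below (plain Python, my own illustrative function names) shows the per-pixel logic and a CPU reference loop; with NumbaPro/Numba, essentially the same `mandel` function can be compiled for the GPU and the double loop replaced by a kernel launch where each thread handles one pixel.

```python
def mandel(x, y, max_iters):
    """Escape-time iteration count for one point of the complex plane.

    With NumbaPro/Numba, a function like this can be compiled as a GPU
    device function and called per pixel from a CUDA kernel.
    """
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        # |z| >= 2 means the point has escaped; report the iteration count.
        if (z.real * z.real + z.imag * z.imag) >= 4.0:
            return i
    return max_iters  # never escaped: treat as inside the set

def create_fractal(xmin, xmax, ymin, ymax, width, height, max_iters):
    """CPU reference: iterate over the image grid serially.

    On the GPU, each thread would compute one (px, py) pair instead of
    this nested loop -- that is the whole parallelization.
    """
    image = [[0] * width for _ in range(height)]
    dx = (xmax - xmin) / width
    dy = (ymax - ymin) / height
    for py in range(height):
        for px in range(width):
            image[py][px] = mandel(xmin + px * dx, ymin + py * dy, max_iters)
    return image
```

For example, `mandel(0.0, 0.0, 20)` returns 20 (the origin never escapes), while `mandel(2.0, 2.0, 20)` returns 0 (it escapes immediately).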
Possibly: a device for SSHing into a VM.
General familiarity with low-level systems concepts (e.g. CPU pipelining, interleaved memory banks) would be beneficial.
I am a senior undergraduate at IIIT-Delhi. I have two years of software engineering experience and interned at Google India in 2016. I have been working with CUDA for the past year. I am highly interested in code architecture, low-level system design, and machine learning. I have given multiple in-college talks on Git internals and Django.