Overview
Explore best practices for writing CUDA kernels in Python in this 38-minute conference talk from NVIDIA GTC 2025. Learn how to use GPU acceleration without leaving Python, and gain a clear understanding of how CUDA kernels are structured and how they work in Python applications. The talk covers concepts essential to high performance, including memory management, thread organization, and optimization techniques. Presented by Leo Fang, Python CUDA Tech Lead at NVIDIA, this session (S72449) covers development and optimization topics in programming languages and compilers, featuring CUDA and CUDA-X technologies. It is suitable for a general-interest audience looking to expand their knowledge of GPU programming in Python.
Syllabus
1001 Ways to Write CUDA Kernels in Python | NVIDIA GTC 2025 Session
Taught by
NVIDIA Developer