GPU Programming with Modern C++

Overview

This course teaches the fundamentals of parallel programming in modern C++. Learning outcomes include understanding when parallelism is worth using, choosing a suitable approach for a given problem, and implementing common parallel patterns such as reduce, map, and scan. Students learn to use the C++ standard threading library and to extend parallelism to heterogeneous devices, specifically GPUs, through the SYCL programming model. Topics include task versus data parallelism, CPU versus GPU architectures, memory optimization, GPU programming, parallelization, and concurrency. The course is intended for software developers who want to use parallel programming techniques to improve software performance on multi-core and heterogeneous architectures.
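As a rough illustration of the three patterns named above, the following sketch expresses map, reduce, and scan with C++17 standard algorithms. It is not taken from the course material; the function names `map_square`, `reduce_sum`, and `scan_sum` are hypothetical, and the sequential overloads are used so that the example builds without any extra parallel runtime.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Map: apply the same function to every element independently.
// (Hypothetical helper name, chosen for this sketch only.)
std::vector<int> map_square(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) { return x * x; });
    return out;
}

// Reduce: combine all elements into a single value.
// std::reduce may regroup operands, so the operation should be
// associative and commutative (integer addition is).
int reduce_sum(const std::vector<int>& in) {
    return std::reduce(in.begin(), in.end(), 0);
}

// Scan: running (prefix) sum; out[i] = in[0] + ... + in[i].
std::vector<int> scan_sum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    return out;
}
```

For example, calling these on `{1, 2, 3, 4}` gives `{1, 4, 9, 16}` for the map, `10` for the reduce, and `{1, 3, 6, 10}` for the scan. All three algorithms also have overloads that take a C++17 execution policy such as `std::execution::par` as their first argument, which is one way to ask the library to run them in parallel where the toolchain supports it.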

Syllabus

Introduction
Michael's background
Why I'm here
Outline
Recap
Right abstraction
Task vs data parallelism
Latency and bandwidth
DMV example
Flynn's taxonomy
CPU vs GPU
Multicore CPU
Architectures
Memory bound problem
Memory optimization
Pad properly
Data layout
Power of computing
What happened
What happened in 2011
CPU vs GPU performance
GPU explosion
Hardware
GPU programming
Parallelization and concurrency
Heterogeneity
Consumer AI
GPU languages
C++ executors
How GPUs work
How CPUs work
How GPUs work
Memory regions
Multiple work items
Wavefronts
Lockstep
Kernel barriers
Summary
Code
SPMD
ND Range
SYCL
Khronos

Taught by

ACCU Conference
