Class Central Tips
This course can apply to various HPC and datacenter workloads and framework including artificial intelligence (AI). You will learn how to handle data parallelism with vector instructions, task parallelism in shared memory with threads, parallelism in distributed memory with message passing, and memory architecture parallelism with optimized data containers. This knowledge will help you to accelerate computational applications by orders of magnitude, all the while keeping your code portable and future-proof.
Prerequisite: programming in C/C++ or Fortran in the Linux environment and Linux shell proficiency (navigation, file copying, editing files in text-based editors, compilation).