Large-scale simulations of engineering and basic science problems require efficient use of modern high performance computing (HPC) infrastructure. This course introduces the concepts of HPC to science and engineering students. Parallel computing tools such as MPI, OpenMP and CUDA will be discussed in connection with domain-specific problems. The course briefly introduces parallel architectures and the performance metrics of HPC programs. Multi-CPU computing on both distributed- and shared-memory architectures will be covered, including OpenMP- and MPI-based parallelization of iterative matrix solvers. Graphics processing unit (GPU) architecture and the concepts of CUDA will be introduced, and matrix calculations using CUDA will be demonstrated.
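To give a concrete flavour of the OpenMP-based parallelization mentioned above, here is a minimal sketch of shared-memory loop parallelism. The function name dot and the choice of a dot product are illustrative only, not course material; it assumes a C compiler with OpenMP support.

```c
#include <stddef.h>

/* Parallel dot product: the loop iterations are independent except for
   the accumulation into `sum`, which the OpenMP reduction clause handles
   by giving each thread a private partial sum and combining them at the
   end. Compiled without OpenMP, the pragma is ignored and the loop runs
   serially with the same result. */
double dot(const double *a, const double *b, size_t n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Compile with gcc -fopenmp and set OMP_NUM_THREADS to control the thread count; without the flag the code still produces the correct serial result.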
INTENDED AUDIENCE : Basic science, Mathematics, Engineering
PREREQUISITES : A background in numerical methods and basic programming will be helpful
INDUSTRIES SUPPORT : Not Applicable
Week 1: Introduction to high performance computing (1), Architectures for parallel computing - Flynn's taxonomy (2), Shared and distributed memory (1), Examples of parallel algorithms (1)
Week 2: Performance metrics - speed up, scalability (2), Communication overheads and latency (1), Introduction to OpenMP (2)
Week 3: OpenMP parallelization of matrix algebra algorithms (2), Demonstration of OpenMP codes (1), Introduction to MPI (2)
Week 4: Communications using MPI (1), Domain decomposition and Schwarz algorithm, Load balancing (4)
Week 5: Jacobi solver – serial and parallel implementation (3), Code demonstration and performance evaluation (2)
Week 6: Architectures of GPU, GPU memories (3), Introduction to CUDA (2)
Week 7: Thread algebra for matrix calculations (3), Examples of CUDA kernels (2)
Week 8: Matrix algebra using CUDA (2), Performance optimization (2), Code demonstration (1)
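Week 2's speedup metric is commonly modelled by Amdahl's law: if a fraction f of the work is inherently serial, the speedup on p processors is bounded by 1 / (f + (1 - f)/p). A one-line sketch (the function name amdahl_speedup is illustrative; the course may define its metrics differently):

```c
/* Amdahl's law: speedup on p processors when a fraction
   `serial_fraction` of the work cannot be parallelized.
   As p grows, the speedup saturates at 1 / serial_fraction. */
double amdahl_speedup(double serial_fraction, int p) {
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p);
}
```

For example, with 10% serial work no number of processors can give more than a 10x speedup, which is why the syllabus pairs speedup with scalability and communication overheads.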
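The serial form of Week 5's Jacobi solver can be sketched as follows. The function names jacobi_step and jacobi_solve and the fixed scratch buffer are illustrative assumptions, not the course's code; the sketch assumes a system (e.g. strictly diagonally dominant) for which Jacobi converges.

```c
#include <math.h>
#include <stddef.h>

/* One Jacobi sweep for Ax = b (n x n, row-major A):
   x_new[i] = (b[i] - sum_{j != i} A[i][j] * x_old[j]) / A[i][i].
   Every entry of x_new depends only on x_old, which is what makes the
   sweep easy to parallelize: each i can go to a different OpenMP
   thread or MPI rank. Returns the max update, usable as a stopping test. */
double jacobi_step(size_t n, const double *A, const double *b,
                   const double *x_old, double *x_new) {
    double max_diff = 0.0;
    for (size_t i = 0; i < n; i++) {
        double s = b[i];
        for (size_t j = 0; j < n; j++)
            if (j != i)
                s -= A[i * n + j] * x_old[j];
        x_new[i] = s / A[i * n + i];
        double d = fabs(x_new[i] - x_old[i]);
        if (d > max_diff) max_diff = d;
    }
    return max_diff;
}

/* Iterate sweeps until the update falls below tol or max_iter is hit. */
void jacobi_solve(size_t n, const double *A, const double *b,
                  double *x, double tol, int max_iter) {
    double tmp[64];               /* illustrative fixed scratch, n <= 64 */
    for (int it = 0; it < max_iter; it++) {
        double diff = jacobi_step(n, A, b, x, tmp);
        for (size_t i = 0; i < n; i++)
            x[i] = tmp[i];
        if (diff < tol)
            break;
    }
}
```

The parallel versions covered in Weeks 3-5 distribute the rows of the sweep across threads or ranks; with MPI, each rank also exchanges the boundary entries of x_old with its neighbours each iteration, which is where the domain decomposition and communication-overhead material of Weeks 2 and 4 comes in.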