Johns Hopkins University

YARN MapReduce Architecture and Advanced Programming

Johns Hopkins University via Coursera

Overview

The course "YARN MapReduce Architecture and Advanced Programming" provides an in-depth understanding of the YARN and MapReduce architectures, focusing on their components and capabilities. Students explore the MapReduce programming model and learn optimization techniques such as combiners, partitioners, and compression to improve job performance. The course covers Mapper and Reducer parallelism, along with practical steps for writing and configuring MapReduce jobs. Advanced topics such as multithreading, speculative execution, and input/output formats are also explored. By the end of the course, participants will have hands-on experience writing and optimizing MapReduce jobs and will be prepared to apply best practices in real-world scenarios. Beyond the foundations of YARN and MapReduce, the course covers optimization strategies that give learners tools to improve data processing efficiency. Whether you're new to MapReduce or looking to deepen your knowledge, the course is aimed at building skill in large-scale data processing.

Syllabus

  • Course Introduction
    • This course provides a comprehensive introduction to the YARN and MapReduce architectures, covering their fundamental components and capabilities. You will explore the MapReduce programming model, focusing on optimization techniques such as combiners, partitioners, and compression. Key concepts like Mapper and Reducer parallelism will be demonstrated, alongside practical steps for writing and configuring MapReduce jobs. The course also delves into advanced topics such as multithreading, speculative execution, and input/output formats. By the end, you will have a deep understanding of MapReduce and be equipped to apply best practices in real-world scenarios.
  • YARN and MapReduce Architecture
    • In this module, we will cover the YARN architecture and its capabilities, followed by the MapReduce architecture built on YARN.
  • MapReduce Programming Basics
    • This module provides a comprehensive overview of the MapReduce API, guiding you through the steps to write a MapReduce program. It covers the concepts of Mapper and Reducer parallelism, illustrating their implementation and impact on data processing efficiency.
  • MapReduce Programming Intermediate
    • This module focuses on advanced MapReduce optimization techniques, including the use of combiners to enhance performance, partitioners to manage data distribution across reducers, and compression methods to optimize I/O. It also covers the application of counters to collect and analyze statistics about MapReduce jobs.
  • MapReduce Programming Advanced
    • This module explores advanced MapReduce concepts including multithreading, the internals of input/output formats, and speculative execution. It also covers running jobs locally and identifies common MapReduce anti-patterns to avoid.
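
To make the syllabus terms concrete, here is a minimal in-memory sketch of how a mapper, combiner, partitioner, reducer, and job counters fit together in a word count. It is an illustration only, not course material and not the Hadoop API: every function name (mapper, combiner, partitioner, reducer, run_job), the counter names, and the sample input are assumptions made for this example, and a real job would run as distributed tasks scheduled by YARN.

```python
import zlib
from collections import defaultdict


def mapper(line):
    """Emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate one map task's output to shrink shuffle traffic."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    """Pick a reducer for a key; a stable hash keeps runs reproducible."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all partial counts that arrived for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    """Simulate map -> combine -> partition/shuffle -> reduce in memory."""
    # Hypothetical counter names, standing in for job statistics.
    counters = {"map_output_records": 0, "combine_output_records": 0}
    # One bucket of key -> [values] per simulated reduce task.
    buckets = [defaultdict(list) for _ in range(num_reducers)]

    for split in splits:  # each split stands in for one map task
        mapped = [pair for line in split for pair in mapper(line)]
        counters["map_output_records"] += len(mapped)
        combined = combiner(mapped)
        counters["combine_output_records"] += len(combined)
        for key, value in combined:
            buckets[partitioner(key, num_reducers)][key].append(value)

    results = {}
    for bucket in buckets:  # each bucket stands in for one reduce task
        for key in sorted(bucket):  # reducers see their keys in sorted order
            word, total = reducer(key, bucket[key])
            results[word] = total
    return results, counters


if __name__ == "__main__":
    sample_splits = [
        ["yarn runs mapreduce", "mapreduce runs jobs"],
        ["yarn schedules jobs"],
    ]
    counts, counters = run_job(sample_splits)
    print(counts)
    print(counters)
```

In this sketch the two counters show what the combiner buys: fewer records leave the map side than the mappers emitted, while the final counts stay the same regardless of how many reducers the partitioner spreads keys across.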

Taught by

Karthik Shyamsunder
