Hadoop Administration - Cloudera Hadoop on AWS

via YouTube

Overview

This course teaches Hadoop administration using Cloudera's CDH5 distribution on Amazon Web Services (AWS). It covers AWS basics (networking, key pair authentication, security groups, storage, and EC2 pricing), setting up Hadoop on AWS, configuring and managing HDFS, HDFS commands, setting up MapReduce v1 and MRv2 with YARN, setting up Hadoop ecosystem tools, and cluster planning. It also helps prepare for the HDPCA and CCAH Hadoop certifications. The material is taught through demonstrations, setup tutorials, command overviews, component reviews, and validation exercises. The course is intended for IT professionals, system administrators, data engineers, and anyone preparing for Hadoop certifications.

Syllabus

Amazon Web Services - Sign up and Regions.
Amazon Web Services - Networking.
Amazon Web Services - Key Pair (authentication) and Security Groups (firewalls).
Amazon Web Services - Storage.
Amazon Web Services - EC2 Pricing (Very Important).
Amazon Web Services - Provision EC2 Instance Demo.
Amazon Web Services - Setup SSH forwarding (Bastion).
Amazon Web Services - Mac - Setup SSH Forwarding (Bastion).
Amazon Web Services - Setup SSH Tunneling and Foxyproxy.
Big Data Introduction.
Hadoop Introduction and brief comparison with Oracle.
Hadoop Administration - CDH5 on AWS - Introduction.
Setup CDH on AWS - Provision EC2 Instances.
Setup CDH on AWS - Setup parallel ssh.
Setup CDH on AWS - Setup http server on master01 or gateway node.
Setup CDH on AWS - Setup local yum repository server for Cloudera Manager and Hadoop.
Setup CDH5 on AWS - Setup pre-requisites using parallel-ssh.
Setup CDH5 on AWS - Install Cloudera Manager.
Setup CDH on AWS - Review Cloudera Management Service Components.
Setup CDH on AWS - Setup HDFS.
HDFS - Files and blocks - dfs.blocksize (Block Size).
HDFS - Replication Factor - Fault Tolerance.
HDFS - Metadata, Datanode, Namenode and Secondary Namenode.
HDFS - Heartbeat, Block report and Checksum.
HDFS - Namenode Recovery (role of editlogs, fsimage and secondary namenode).
Setup CDH on AWS - Setup HDFS High Availability Introduction.
Setup CDH on AWS - Setup HDFS High Availability using Cloudera Manager.
HDFS - High Availability - Setup using Cloudera Manager.
HDFS - High Availability - Review components and parameter files.
HDFS Commands - hadoop fs command overview, help and appendToFile.
HDFS Commands - cat, checksum, chgrp, chmod, chown.
HDFS Commands - copyFromLocal or put, copyToLocal or get and cp.
HDFS Commands - count, df, du and expunge.
HDFS Commands - find, getmerge and ls.
Hadoop Certification - HDPCA - Recover a snapshot.
Hadoop Certification - HDPCA - Create a snapshot of an HDFS directory.
Hadoop Certification - HDPCA - Configure ACLs.
HDFS Commands - mkdir, moveFromLocal, moveToLocal, mv.
HDFS Commands - stat, tail, test, text, touchz and usage.
Setup Map Reduce v1 (MRv1) or Classic using Cloudera Distribution.
Setup Map Reduce v1 (MRv1) or Classic using Cloudera Distribution - Review.
Setup Map Reduce v1 (MRv1) or Classic using Cloudera Distribution - JobLifeCycle.
Setup Map Reduce v1 (MRv1) - Heartbeat, Fault Tolerance and Speculative Execution.
Setup Map Reduce v1 (MRv1) - Challenges.
Setup CDH on AWS - Configure MRv2 + YARN.
Setup CDH on AWS - Configure MRv2 + YARN - Review.
Setup CDH on AWS - Configure MRv2 + YARN - Validate.
Setup CDH on AWS - Configure MRv2 + YARN - Map Reduce Job Life Cycle.
Setup CDH on AWS - Configure MRv2 + YARN - Fault Tolerance.
Hadoop Certification - CCAH - Principal points to consider when choosing hardware and OS for a Hadoop cluster.
Hadoop Certification - CCAH - Cluster planning.
Hadoop Certification - CCAH - Logging Configuration (log4j.properties).
Hadoop Certification - CCAH - Setup Hadoop ecosystem tools - Introduction.
Hadoop Certification - CCAH - Hadoop metrics and cluster health monitoring.
Setup CDH on AWS - Setup MySQL for Hive, Oozie, Sqoop, etc.
Setup CDH - Setup Pig using Cloudera Manager.
Setup CDH on AWS - Setup Hive using Cloudera Manager.
Setup CDH on AWS - Setup Hive using Cloudera Manager - Review.
Setup CDH on AWS - Setup Hive using Cloudera Manager - Validate.
Setup CDH on AWS - Setup Sqoop using Cloudera Manager.
Setup CDH - Schedulers Overview.
Setup CDH - FIFO Scheduler.
Setup CDH - Fair Scheduler.
Setup CDH - Capacity Scheduler.

Taught by

itversity
