Architecting Big Data Solutions Using Google Dataproc

Overview

Dataproc is Google’s managed Hadoop offering on the cloud. This course teaches you how separating storage from compute lets you use clusters more efficiently, purely for processing data rather than for storing it.

When organizations plan their move to the Google Cloud Platform, Dataproc offers the same features as an on-premises Hadoop cluster, along with powerful additional paradigms such as the separation of compute and storage. Dataproc allows you to lift and shift your Hadoop processing jobs to the cloud and store your data separately in Cloud Storage buckets, which eliminates the requirement to keep your clusters always running.

In this course, Architecting Big Data Solutions Using Google Dataproc, you’ll learn to work with managed Hadoop on the Google Cloud and the best practices to follow for migrating your on-premises jobs to Dataproc clusters. First, you’ll delve into creating a Dataproc cluster and configuring firewall rules so that you can access the cluster manager UI from your local machine. Next, you’ll discover how to use the Spark distributed analytics engine on your Dataproc cluster. Then, you’ll explore how to write code that integrates your Spark jobs with BigQuery and Cloud Storage buckets using connectors. Finally, you’ll learn how to use your Dataproc cluster to perform extract, transform, and load operations using Pig as a scripting language, and how to work with Hive tables.

By the end of this course, you’ll have the knowledge you need to work with Google’s managed Hadoop offering and a sound idea of how to migrate the jobs and data on your on-premises Hadoop cluster to the Google Cloud.
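Because the data lives in Cloud Storage rather than on the cluster, a cluster can be created for a job and deleted afterwards. The following is a minimal sketch of that create, submit, delete cycle, not material taken from the course. It only builds the `gcloud dataproc` command lines and prints them; it does not run anything. The cluster name, region, bucket, and job file are hypothetical placeholders, and the helper function `ephemeral_cluster_commands` is invented for illustration.

```python
import shlex


def ephemeral_cluster_commands(cluster, region, bucket, job_file, num_workers=2):
    """Build gcloud commands for a create -> submit -> delete cycle.

    All argument values are supplied by the caller; nothing is executed here.
    """
    # The job script is read from Cloud Storage, not from cluster-local disk.
    job_uri = f"gs://{bucket}/{job_file}"

    create = [
        "gcloud", "dataproc", "clusters", "create", cluster,
        f"--region={region}", f"--num-workers={num_workers}",
    ]
    submit = [
        "gcloud", "dataproc", "jobs", "submit", "pyspark", job_uri,
        f"--cluster={cluster}", f"--region={region}",
    ]
    # --quiet skips the interactive confirmation prompt on delete.
    delete = [
        "gcloud", "dataproc", "clusters", "delete", cluster,
        f"--region={region}", "--quiet",
    ]
    return [create, submit, delete]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    commands = ephemeral_cluster_commands(
        "etl-cluster", "us-central1", "example-bucket", "jobs/wordcount.py"
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Returning the commands as argument lists rather than as one shell string keeps quoting explicit, and lets you inspect the sequence before deciding whether to pass it to something like `subprocess.run`.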

Topics:

Course Overview
Introducing Google Dataproc for Big Data on the Cloud
Running Hadoop MapReduce Jobs on Google Dataproc
Working with Apache Spark on Google Dataproc
Working with Pig and Hive on Google Dataproc

Taught by

Janani Ravi

Reviews

4.9 rating at Pluralsight based on 14 ratings
