Name: Spark and Data Lakes

Spark and Data Lakes

Overview

In this course, you will learn about the big data ecosystem and how to use Spark to work with massive datasets. You'll also learn how to store big data in a data lake and query it with Spark.

Syllabus

Introduction to Spark and Data Lakes

In this course, you'll learn how Spark evaluates code and uses distributed computing to process and transform data. You'll work in the big data ecosystem to build data lakes and data lakehouses.
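As a rough illustration of the evaluation model mentioned above: Spark transformations are lazy and only run once an action asks for a result. The sketch below mimics that idea with plain Python generators rather than Spark itself. The `LazyDataset` class and its method names are hypothetical and only loosely echo Spark's API.

```python
from typing import Callable, Iterable, List


class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (not a Spark API)."""

    def __init__(self, source: Callable[[], Iterable[int]]) -> None:
        # Keep a factory instead of data, so nothing is read until an action runs.
        self._source = source

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Transformation: only records the step, does no work yet.
        return LazyDataset(lambda: (fn(x) for x in self._source()))

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        # Transformation: also deferred.
        return LazyDataset(lambda: (x for x in self._source() if pred(x)))

    def collect(self) -> List[int]:
        # Action: pulls every element through the recorded steps.
        return list(self._source())


calls = []  # records each time the mapping function actually runs


def double(x: int) -> int:
    calls.append(x)
    return x * 2


numbers = LazyDataset(lambda: range(1, 6))
pipeline = numbers.map(double).filter(lambda x: x > 4)

calls_before_action = len(calls)  # still zero: no transformation has executed
result = pipeline.collect()       # the action triggers the whole pipeline
```

Building `pipeline` costs nothing; the work happens only inside `collect()`. Real Spark additionally plans and distributes that work across a cluster, which this toy does not attempt.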

Big Data Ecosystem, Data Lakes, and Spark

In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.

Spark Essentials

In this lesson, you'll learn how to use Spark to wrangle, filter, and transform distributed data with PySpark and Spark SQL.

Using Spark in AWS

In this lesson, you will learn to use Spark and work with data lakes on Amazon Web Services (AWS), using S3, AWS Glue, and AWS Glue Studio.

Ingesting and Organizing Data in a Lakehouse

In this lesson, you'll work with lakehouse zones, building and configuring them in AWS.
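One way to picture zones is as successive copies of the data, each held to a stricter standard than the last. The sketch below uses plain Python and in-memory records in place of S3 and AWS Glue. The zone names (landing, trusted, curated), the field names, and the consent rule are all assumptions made for illustration, not details taken from the course.

```python
import json

# Hypothetical landing zone: raw JSON lines exactly as they arrived.
landing_zone = [
    '{"customer": "a@example.com", "share_with_research": true, "steps": 12}',
    '{"customer": "b@example.com", "share_with_research": false, "steps": 7}',
    '{"customer": "c@example.com", "share_with_research": true, "steps": 3}',
]


def to_trusted(raw_lines):
    """Parse raw lines and keep only records that pass an assumed consent check."""
    records = (json.loads(line) for line in raw_lines)
    return [r for r in records if r.get("share_with_research")]


def to_curated(trusted_records):
    """Project trusted records down to the fields an analyst would query."""
    return [{"customer": r["customer"], "steps": r["steps"]} for r in trusted_records]


trusted_zone = to_trusted(landing_zone)
curated_zone = to_curated(trusted_zone)
```

Each function reads one zone and writes the next, so the raw data stays untouched and any zone can be rebuilt from the one before it.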

Project: STEDI Human Balance Analytics

In this project, you'll work with sensor data used to train a machine learning model. You'll load JSON data from an S3 data lake into Athena tables using Spark and AWS Glue.

Taught by

Sean Murdock - Instructor

