Spark

Overview

In this course, you’ll learn how to use Spark to work with big data and build machine learning models at scale, including how to wrangle and model massive datasets with PySpark, the Python library for interacting with Spark. In lesson one, you will learn about big data and how Spark fits into the big data ecosystem. In lesson two, you will practice processing and cleaning datasets to get comfortable with Spark’s SQL and DataFrame APIs. In lesson three, you will learn to debug and optimize Spark code running on a cluster. In lesson four, you will use Spark’s Machine Learning Library to train machine learning models at scale.

Syllabus

The Power of Spark

Understand the big data ecosystem
Understand when to use Spark and when not to use it

Data Wrangling with Spark

Manipulate data with Spark SQL and Spark DataFrames
Use Spark for wrangling massive datasets

Debugging and Optimization

Troubleshoot common errors and optimize your code using the Spark Web UI

Machine Learning with Spark

Use Spark’s Machine Learning Library to train machine learning models at scale

Taught by

David Drummond and Judit Lantos
