Scalable Machine Learning with PySpark MLlib

via Pluralsight

Overview

PySpark MLlib powers distributed machine learning. In this course, Scalable Machine Learning with PySpark MLlib, you’ll learn to use Apache Spark’s distributed computing framework for your machine learning workloads. First, you’ll explore the fundamentals of Spark MLlib and the Spark ML Pipeline API, learning how they differ from single‐machine solutions. Next, you’ll discover how to perform feature engineering and build classification and regression models that handle large datasets efficiently. Finally, you’ll learn how to tune hyperparameters and optimize pipeline performance. When you’re finished with this course, you’ll have the PySpark MLlib skills and knowledge needed to implement and scale out your own machine learning solutions on large datasets.

Taught by

Warner Chaves
