Cleaning Data with Pandas

via Pluralsight

Overview

Learn to clean and manipulate data using the Pandas library in Python. Cover common issues like missing values and irrelevant features, use correlation analysis, encode categorical features, and prepare data for machine learning models.

In the real world, data is rarely organized into neat tables that can be fed directly into a machine learning model or used for data analysis. The data you find is often messy, is missing many values, and tends to have other issues that you need to solve before you can draw any meaningful inference from it. In this course, Cleaning Data with Pandas, you will learn how to use the Pandas library in Python to clean and manipulate data. First, you will understand what data cleaning is and why it is so important in the context of data analysis. Then, you will solve the most common issues plaguing datasets: missing values, irrelevant features, and duplicate values. Next, you will see what correlation analysis is and how it helps in data cleaning. Finally, you will see how to encode categorical features and prepare your dataset to be fed into machine learning models. When you’re finished with this course, you will have the skills and knowledge you need to effectively clean and manipulate data using Pandas.
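As a rough illustration of the steps listed above, here is a minimal sketch in Pandas. It is not taken from the course: the toy dataset, the column names (`row_id`, `age`, `height_cm`, `height_in`, `city`), the median and "Unknown" fill choices, and the decision to drop one of two highly correlated columns are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; every column name and value is made up for illustration.
raw = pd.DataFrame({
    "row_id": [1, 2, 3, 4, 4],
    "age": [25.0, np.nan, 40.0, 31.0, 31.0],
    "height_cm": [170.0, 165.0, 180.0, 175.0, 175.0],
    "height_in": [66.9, 65.0, 70.9, 68.9, 68.9],
    "city": ["Paris", "Lima", None, "Paris", "Paris"],
})

# 1. Duplicate values: drop rows that are exact copies of an earlier row.
df = raw.drop_duplicates().reset_index(drop=True)

# 2. Irrelevant features: an identifier column carries no predictive signal.
df = df.drop(columns=["row_id"])

# 3. Missing values: fill numbers with the column median and
#    categories with a placeholder label (one possible strategy among many).
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna("Unknown")

# 4. Correlation analysis: two features that are almost perfectly correlated
#    are redundant, so one of them can be dropped.
corr = df[["age", "height_cm", "height_in"]].corr()
redundancy = corr.loc["height_cm", "height_in"]
if redundancy > 0.95:  # assumed threshold, chosen only for this example
    df = df.drop(columns=["height_in"])

# 5. Categorical encoding: one-hot encode the text column so that every
#    remaining column is numeric or boolean.
encoded = pd.get_dummies(df, columns=["city"])

print(encoded)
```

Whether to fill or drop missing values, and which correlation threshold counts as redundant, depends on the dataset and the model; the values used here are placeholders.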

Taught by

Pratheerth Padman

Reviews

4.7 rating at Pluralsight based on 23 ratings
