Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

DataCamp

Working with Categorical Data in Python

via DataCamp

Overview

Learn how to manipulate and visualize categorical data using pandas and seaborn.

Being able to understand, use, and summarize non-numerical data—such as a person’s blood type or marital status—is a vital component of being a data scientist. In this course, you’ll learn how to manipulate and visualize categorical data using pandas and seaborn. Through hands-on exercises, you’ll get to grips with pandas' categorical data type, including how to create, delete, and update categorical columns. You’ll also work with a wide range of datasets including the characteristics of adoptable dogs, Las Vegas trip reviews, and census data to develop your skills at working with categorical data.

Syllabus

Introduction to Categorical Data
-Almost every dataset contains categorical information—and often it’s an unexplored goldmine of information. In this chapter, you’ll learn how pandas handles categorical columns using the data type category. You’ll also discover how to group data by categories to unearth great summary statistics.

Categorical pandas Series
-Now it’s time to learn how to set, add, and remove categories from a Series. You’ll also explore how to update, rename, collapse, and reorder categories, before applying your new skills to clean and access other data within your DataFrame.

Visualizing Categorical Data
-In this chapter, you’ll use the seaborn Python library to create informative visualizations using categorical data—including categorical plots (cat-plot), box plots, bar plots, point plots, and count plots. You’ll then learn how to visualize categorical columns and split data across categorical columns to visualize summary statistics of numerical columns.

Pitfalls and Encoding
-Lastly, you’ll learn how to overcome the common pitfalls of using categorical data. You’ll also grow your data encoding skills as you are introduced to label encoding and one-hot encoding—perfect for helping you prepare your data for use in machine learning algorithms.

Taught by

Kasey Jones

Reviews

4.6 rating at DataCamp based on 27 ratings

Start your review of Working with Categorical Data in Python

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.