Learn about the principles of tidy data, and discover how to create and manipulate tibbles, transforming source data into tidy formats.
Syllabus
Introduction
- Welcome
- What you need to know
- Using the exercise files
- What is tidy data?
- Variables, observations, and values
- Common data problems
- Using the tidyverse
- Building and printing tibbles
- Subsetting tibbles
- Filtering tibbles
- What are CSV files?
- Importing CSV files into R
- What are TSV files?
- Importing TSV files into R
- Importing delimited files into R
- Importing fixed-width files into R
- Importing Excel files into R
- Reading data from databases and the web
- Wide vs. long datasets
- Making wide datasets long with gather()
- Making long datasets wide with spread()
- Converting data types in R
- Working with dates and times in R
- Detecting outliers
- Missing and special values in R
- Breaking apart columns with separate()
- Combining columns with unite()
- Manipulating strings in R with stringr
- Understanding the coal dataset
- Reading in the coal dataset
- Converting the coal dataset from long to wide
- Segmenting the coal dataset
- Visualizing the coal dataset
- Understanding the water quality dataset
- Reading in the water quality dataset
- Filtering the water quality dataset
- Water quality data types
- Correcting data entry errors
- Identifying and removing outliers
- Converting temperature from Fahrenheit to Celsius
- Widening the water quality dataset
- Understanding the Social Security Disability dataset
- Importing the Social Security Disability dataset
- Making the Social Security Disability dataset long
- Formatting dates in the Social Security Disability dataset
- Handling fiscal years in the Social Security Disability dataset
- Widening the Social Security Disability dataset
- Visualizing the Social Security Disability dataset
- Next steps
Taught by
Mike Chapple