Exploratory Data Analysis with Textual Data in R / Quanteda

Coursera Project Network via Coursera

Overview

In this one-hour, project-based course, you will explore concession speeches by US presidential candidates over time, looking specifically at speech length and top words and at how both vary between Democratic and Republican candidates. You will learn how to import textual data stored in raw text files, turn these files into a corpus (a collection of textual documents), and tokenize the text, all using the R package quanteda. You will also learn how to extract useful information from filenames and use it to generate visualizations of textual data with the stringr and ggplot2 packages. Note: This course works best for learners based in North America. We’re currently working on providing the same experience in other regions.

Syllabus

  • Exploratory Data Analysis with Textual Data in R using Quanteda
    • This guided project is for beginners interested in quantitative text analysis in R. It assumes no prior knowledge of textual analysis and focuses on exploring one set of textual data: US presidential concession speeches. You will learn how to import textual data stored in raw text files into R, turn these files into a corpus (a collection of textual documents), and tokenize the text, all using the R package quanteda. You will also learn how to extract useful information from filenames and use it to generate visualizations with the stringr and ggplot2 packages. Along the way, you will explore how concession speeches have changed over time, looking specifically at speech length and top words and at variation between Democratic and Republican candidates. By the end of the exercise, you will know how to load textual data into R, summarize it using descriptive quantities of interest, turn text into tokens, and visualize changes over time as well as top words. Learners should have a basic understanding of the statistical programming language R. Familiarity with the stringr and ggplot2 packages is useful but not essential.
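
The workflow described in the syllabus can be sketched roughly as follows. This is a minimal illustration, not the course's own code. The filenames, the `<year>_<candidate>_<party>.txt` naming pattern, and the two short texts are made-up placeholders, and an in-memory character vector stands in for the raw text files that the course reads from disk.

```r
library(quanteda)
library(stringr)
library(ggplot2)

# Placeholder data: the names play the role of filenames.
# Assumed (hypothetical) pattern: <year>_<candidate>_<party>.txt
speeches <- c(
  "1952_CandidateA_D.txt" = "I congratulate the winner. The people have spoken.",
  "2008_CandidateB_R.txt" = "The people have spoken, and they have spoken clearly."
)

# Turn the texts into a corpus; the vector names become document names.
corp <- corpus(speeches)

# Extract year and party from the filenames with stringr
# and attach them to the corpus as document variables.
parts <- str_split_fixed(str_remove(names(speeches), "\\.txt$"), "_", 3)
docvars(corp, "year") <- as.integer(parts[, 1])
docvars(corp, "party") <- parts[, 3]

# Tokenize, dropping punctuation.
toks <- tokens(corp, remove_punct = TRUE)

# Speech length per document, alongside the filename metadata.
speech_info <- data.frame(
  year = docvars(corp, "year"),
  party = docvars(corp, "party"),
  n_tokens = as.integer(ntoken(toks))
)

# Top words after removing English stopwords.
top_words <- topfeatures(dfm(tokens_remove(toks, stopwords("en"))), n = 5)

# Speech length over time, coloured by party.
length_plot <- ggplot(speech_info, aes(x = year, y = n_tokens, colour = party)) +
  geom_point() +
  labs(x = "Election year", y = "Speech length (tokens)", colour = "Party")
```

Printing `speech_info`, `top_words`, or `length_plot` shows each intermediate result. With real files, the named vector would be replaced by whatever import step the course uses, and the filename pattern would need to match the actual files.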

Taught by

Nicole Baerg
