University of Michigan

The Total Data Quality Framework

University of Michigan via Coursera

Overview

By the end of this first course in the Total Data Quality specialization, learners will be able to:
1. Identify the essential differences between designed and gathered data and summarize the key dimensions of the Total Data Quality (TDQ) Framework;
2. Define the three measurement dimensions of the Total Data Quality framework, and describe potential threats to data quality along each of these dimensions for both gathered and designed data;
3. Define the three representation dimensions of the Total Data Quality framework, and describe potential threats to data quality along each of these dimensions for both gathered and designed data; and
4. Describe why data analysis defines an important dimension of the Total Data Quality framework, and summarize potential threats to the overall quality of an analysis plan for designed and/or gathered data.

This specialization as a whole aims to explore the Total Data Quality framework in depth and provide learners with more information about the detailed evaluation of total data quality that needs to happen prior to data analysis. The goal is for learners to incorporate evaluations of data quality into their process as a critical component for all projects. We sincerely hope to disseminate knowledge about total data quality to all learners, such as data scientists and quantitative analysts, who have not had sufficient training in the initial steps of the data science process that focus on data collection and evaluation of data quality. We feel that extensive knowledge of data science techniques and statistical analysis procedures will not help a quantitative research study if the data collected/gathered are not of sufficiently high quality.

This specialization will focus on the essential first steps in any type of scientific investigation using data: either generating or gathering data, understanding where the data come from, evaluating the quality of the data, and taking steps to maximize the quality of the data prior to performing any kind of statistical analysis or applying data science techniques to answer research questions. Given this focus, there will be little material on the analysis of data, which is covered in myriad existing Coursera specializations. The primary focus of this specialization will be on understanding and maximizing data quality prior to analysis.

Syllabus

  • Introduction, Different Types of Data and the Total Data Quality Framework
    • Welcome to the Total Data Quality Framework Course! This is the first course in the Total Data Quality Specialization. This week, you’ll get to know your instructors after reviewing the course syllabus and the learning goals. We will then introduce you to the basic components of the Total Data Quality (TDQ) Framework through a series of video lectures, including Designed Data, Gathered Data, and Hybrid Data. Next, we’ll provide a high-level overview of the TDQ Framework and incorporate the perspectives of global TDQ experts in both a lecture and an interview. We’ll then wrap up the week with a short quiz about measurement and representation concepts.
  • Measurement Dimensions of Total Data Quality: Validity, Data Origin, and Data Processing
    • In Week 2, we’ll explore the concepts of validity, data origin, and data processing. First, we’ll define validity and discuss threats to validity for designed data and gathered data. We’ll also explore validity through an interview, a real-world application, and a case study. After taking a short quiz to test your knowledge of validity, you’ll move to the data origin module. We’ll define data origin and explore data origin threats for designed and gathered data through a series of video lectures and case studies. The data origin module will conclude with a short quiz. Week 2 will conclude with an exploration of data processing, data processing threats for designed and gathered data, case studies, and a quiz to check your understanding of data processing.
  • Representation Dimensions of Total Data Quality: Data Access, Data Source, and Data Missingness
    • This week, we’ll explore the three representation dimensions of the TDQ framework along with potential threats to data quality. First, we’ll define and discuss data access, as well as data access threats for gathered and designed data, through a series of video lectures, readings, and case studies. After you complete a quiz on data access, we’ll define data source and explore data source threats for designed and gathered data, along with two case studies. Lastly, we’ll define data missingness along with data missingness threats for designed and gathered data, and then conclude the week with a quiz.
  • Data Analysis as an Important Aspect of TDQ
    • We’ll wrap up the Total Data Quality Framework course this week. We’ll discuss why data analysis is a critical dimension of the TDQ framework and examine threats to data analysis quality for designed and gathered data. You’ll also review several case studies and can complete an optional tutorial using free R software. After a short quiz on data analysis threats, we’ll conclude the course with a list of references from across Course 1 and ask you to complete a course survey.

Taught by

Brady T. West, James Wagner, Jinseok Kim, and Trent D. Buskirk
