
University of Colorado Boulder

Data Mining Pipeline

University of Colorado Boulder via Coursera


This course introduces the key steps involved in the data mining pipeline, including data understanding, data preprocessing, data warehousing, data modeling, interpretation and evaluation, and real-world applications.

Data Mining Pipeline can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at

Course logo image courtesy of Francesco Ungaro, available on Unsplash.


  • Data Mining Pipeline
    • This module introduces data mining and the data mining pipeline, including the four views of data mining and the key components of the pipeline.
  • Data Understanding
    • This module covers data understanding by identifying key data properties and applying techniques to characterize different datasets.
  • Data Preprocessing
    • This module explains why data preprocessing is needed and what techniques can be used to preprocess data.
  • Data Warehousing
    • This module covers the key characteristics of data warehousing and the techniques to support data warehousing.
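
To make the stages named in the syllabus concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the toy records, the function names (`summarize`, `impute_mean`, `min_max_scale`, `roll_up`), and the choice of mean imputation, min-max scaling, and a per-group average are all assumptions picked only to illustrate what data understanding, preprocessing, and a warehouse-style aggregation might look like.

```python
from statistics import mean, median

# Hypothetical toy records (not from the course); None marks a missing reading.
records = [
    {"city": "Boulder", "temp_c": 21.0},
    {"city": "Boulder", "temp_c": None},
    {"city": "Denver", "temp_c": 25.0},
    {"city": "Denver", "temp_c": 29.0},
]


def summarize(values):
    """Data understanding: basic properties of one numeric attribute."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
    }


def impute_mean(values):
    """Preprocessing: fill missing values with the mean of the observed ones."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Preprocessing: rescale values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def roll_up(rows, key, measure):
    """Warehouse-style aggregation: average a measure within each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[measure])
    return {k: mean(vs) for k, vs in groups.items()}


temps = [r["temp_c"] for r in records]
summary = summarize(temps)          # inspect the attribute before changing it
filled = impute_mean(temps)         # handle the missing reading
scaled = min_max_scale(filled)      # normalize for downstream modeling
cleaned = [{**r, "temp_c": t} for r, t in zip(records, filled)]
by_city = roll_up(cleaned, "city", "temp_c")
```

Each step takes the previous step's output, which is the sense in which these stages form a pipeline; a real workflow would choose imputation and scaling methods based on what the data understanding step reveals.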

Taught by

Qin (Christine) Lv

