
Machine Learning Data Lifecycle in Production via Coursera


In the second course of the Machine Learning Engineering for Production Specialization, you will build data pipelines by gathering, cleaning, and validating datasets and assessing data quality. You will implement feature engineering, transformation, and selection with TensorFlow Extended to get the most predictive power out of your data. You will also establish the data lifecycle by using data lineage and provenance metadata tools, and follow data evolution with enterprise data schemas.

Understanding machine learning and deep learning concepts is essential, but if you’re looking to build an effective AI career, you need production engineering capabilities as well. Machine learning engineering for production combines the foundational concepts of machine learning with the functional expertise of modern software development and engineering roles to help you develop production-ready skills.



  • Week 1: Collecting, Labeling and Validating Data
    • This week gives a quick introduction to machine learning production systems. More concretely, you will learn to use the TensorFlow Extended (TFX) library to collect, label, and validate data so that it is production-ready.
  • Week 2: Feature Engineering, Transformation and Selection
    • Implement feature engineering, transformation, and selection with TensorFlow Extended by encoding structured and unstructured data types and addressing class imbalances.
  • Week 3: Data Journey and Data Storage
    • Understand the data journey over a production system’s lifecycle and leverage ML metadata and enterprise schemas to address quickly evolving data.
  • Week 4 (Optional): Advanced Labeling, Augmentation and Data Preprocessing
    • Combine labeled and unlabeled data to improve ML model accuracy and augment data to diversify your training set.

Taught by

Robert Crowe
