Duke University

Virtualization, Docker, and Kubernetes for Data Engineering

Duke University via Coursera

Overview

Throughout this course, you'll explore virtualization, containerization, and Kubernetes, the tools that power data engineering in industry. Each week introduces a new set of tools and platforms used in data engineering. From Docker and Kubernetes to advanced topics such as AI-assisted coding with GitHub Copilot, container image management with Azure Container Registry and Amazon Elastic Container Registry, and Site Reliability Engineering (SRE) practices, you'll go beyond the basics of data engineering. Whether you're a current student looking to expand your skills or a working professional aiming to take your expertise to the next level, this course is designed to give you advanced knowledge and hands-on experience.

Syllabus

  • Virtualization Theory and Concepts
    • In this module, you will learn the fundamentals of virtualization, including hardware utilization and scaling applications. You will start by understanding what virtualization is and then examine virtual machines. Through an introduction to VirtualBox and a hands-on demo, you will gain a practical understanding of how virtual machines work and what benefits they offer. You will also explore container concepts, focusing on Docker as a key containerization tool. Through an introduction to Docker and its architecture, you will learn how to scale applications using containers. You will be assessed through a series of hands-on exercises in which you create and manage virtual machines and containers.
  • Using Docker
    • In this module, you will learn how to effectively work with the Docker client, create volumes, and run databases in containers, gaining hands-on experience in managing containerized applications. You will also explore how to use the Docker command line for tasks such as building images and working with Dockerfiles, enabling you to package your software efficiently. You'll get a chance to study real-life Dockerfile examples and consult the Dockerfile reference for best practices. Furthermore, you will dive into orchestration with Docker Compose, learning how to manage multi-container applications using Compose. As an extension to this, you will be introduced to Airflow, a workflow management platform, and learn how to integrate it with Docker Compose for a seamless automation experience.
  • Kubernetes: Container Orchestration in Action
    • In this module, you will study Kubernetes, the cornerstone of modern container orchestration. You'll begin with key Kubernetes concepts, cluster architecture, and service deployments. The advantages of cloud development environments, exemplified by GitHub Codespaces, will become clearer as you explore GitHub's ecosystem and AI-assisted coding with GitHub Copilot and Amazon CodeWhisperer. The module culminates in hands-on experience as you deploy Kubernetes using Minikube within GitHub Codespaces. You will gain a solid foundation in Kubernetes essentials and cloud-based development, setting the stage for managing containerized applications and coding collaboratively.
  • Building Kubernetes Solutions
    • This module immerses you in the hands-on world of Kubernetes solutions. You'll start by mastering containerization, constructing FastAPI microservices, and deploying containerized applications using Azure Container Registry and Amazon Elastic Container Registry. Next, explore options for cloud-based container orchestration, featuring Google Cloud Run and AWS Copilot, and expand your coding horizons in AWS Cloud9. Finally, address critical production issues as you delve into load testing, monitoring systems, the SRE mindset for MLOps, and the art of operationalizing microservices. This module offers a comprehensive toolkit to navigate Kubernetes in real-world scenarios, combining theory and practice to prepare you for Kubernetes success.

Taught by

Noah Gift, Kennedy Behrman and Matt Harrison

Reviews

3.6 rating at Coursera based on 16 ratings
