Introduction to Data Engineering
DeepLearning.AI and Amazon Web Services via Coursera
Overview
In this course, you will be introduced to the data engineering lifecycle, from data generation in source systems through ingestion, transformation, and storage to serving data to downstream stakeholders. You’ll study the key undercurrents that affect all stages of the lifecycle and start developing a framework for how to think like a data engineer. To gain hands-on practice, you’ll gather stakeholder needs, translate those needs into system requirements, and choose tools and technologies to build systems that provide business value. By the end of this course, you’ll be spinning up batch and streaming data pipelines to serve product recommendations on the AWS cloud!
Syllabus
- Introduction to Data Engineering
- Gain a high-level overview of the data engineering lifecycle and its key undercurrents to understand how data engineers add business value to organizations. Begin building a mental framework for thinking like a data engineer by gathering stakeholder needs and translating them into system requirements. Learn the basics of working on the cloud from an AWS expert.
- The Data Engineering Lifecycle and Undercurrents
- Dive deeper into the stages of the data engineering lifecycle and its key undercurrents. Build an end-to-end data pipeline on AWS that encompasses all the stages of the data engineering lifecycle.
- Data Architecture
- Define data architecture and how it fits within the larger enterprise architecture. Examine the principles of good data architecture and how these principles inform tools and technology choices. Evaluate and optimize the security, performance, reliability, cost-efficiency, and scalability of a web application hosted on AWS.
- Translating Requirements to Architecture
- Practice gathering stakeholder needs and translating them into system requirements. Choose the appropriate tools and technologies based on the system requirements, then build an end-to-end data system with a batch component and a streaming component that trains a product recommendation system and serves product recommendations to a sales platform.
Taught by
Joe Reis