Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that can be hard to rectify later. In this course we cover the foundations of what a data lake is, how to ingest and organize data in the data lake, and the data processing that can be done to optimize performance and cost when consuming the data at scale. This course is for professionals (architects, system administrators, and DevOps) who need to design and build an architecture for secure and scalable data lake components. Students will learn about the use cases for a data lake and contrast them with a traditional infrastructure of servers and storage.
Week 1: Hello World, I mean, Hello Data Lakes!
Video: Meet the Instructors
Video: Introduction to Week 1
Video: Why Data Lakes?
Video: Characteristics of a Data Lake
Video: Data Lake Components
Reading: Data Lake Characteristics and Components
Video: Comparison of a Data Lake to a Data Warehouse
Reading: Data Lakes and Data Warehouses
Video: Discussing sample Data Lake Architectures
Quiz/Assessment: Week 1 quiz
Week 2: AWS data-related services
Video: Introduction to Week 2
Video: AWS Data Lake related services
Video: Amazon S3
Video: AWS Glue Data Catalog
Reading: S3 and Glue Data Catalog
Video: AWS Services used for data movement
Reading: Kinesis, API Gateway, etc.
Video: AWS Services for Data processing
Video: AWS Services for Analytics
Video: AWS Services used for Predictive Analytics and Machine Learning