Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Amazon Web Services

Introduction to Designing Data Lakes on AWS

Amazon Web Services via Coursera

Overview

Prepare for a new career with $100 off Coursera Plus
Gear up for jobs in high-demand fields: data analytics, digital marketing, and more.
In this class, Introduction to Designing Data Lakes on AWS, we will help you understand how to create and operate a data lake in a secure and scalable way, without previous knowledge of data science! Starting with the "WHY" you may want a data lake, we will look at the Data-Lake value proposition, characteristics and components.

Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that could be hard to rectify. In this course we will cover the foundations of what a Data Lake is, how to ingest and organize data into the Data Lake, and dive into the data processing that can be done to optimize performance and costs when consuming the data at scale. This course is for professionals (Architects, System Administrators and DevOps) who need to design and build an architecture for secure and scalable Data Lake components. Students will learn about the use cases for a Data Lake and, contrast that with a traditional infrastructure of servers and storage.

Syllabus

  • Week 1
    • Welcome to the course! In Week 1, you'll discover why you may want a Data Lake, its characteristics and components, and how it compares to other data data scenarios, such as databases and data warehouses.
  • Week 2
    • In Week 2, you'll build on your knowledge of what data lakes are and why they may be a solution for your needs. You'll explore AWS services that can be used in data lake architectures, like Amazon S3, AWS Glue, Amazon Athena, Amazon Elasticsearch Service, LakeFormation, Amazon Rekognition, API Gateway and other services used for data movement, processing and visualization.
  • Week 3
    • In Week 3, you'll explore specifics of data cataloging and ingestion, and learn about services like AWS Transfer Family, Amazon Kinesis Data Streams, Kinesis Firehose, Kinesis Analytics, AWS Snow Family, AWS Glue Crawlers, and others. You'll also discover when is the right time to process data--before, after, or while data is being ingested. Given scenarios, you'll be able to easily identify when to process data and match the most appropriate AWS services to each scenario.
  • Week 4
    • In Week 4, you are going to dive deeper into data optimization and data processing. Demos around best practices will show you how to optimize your dataset for performance and cost--just by using the right tool for the job! You will also discover data security, data visualization tools, and AWS datasets you can use to experiment and get started.

Taught by

Morgan Willis and Rafael Lopes

Reviews

4.7 rating at Coursera based on 205 ratings

Start your review of Introduction to Designing Data Lakes on AWS

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.