Overview
This course teaches learners about key data reliability challenges and how Delta Lake brings reliability to data lakes at scale. Participants will understand how Delta Lake fits within an Apache Spark™ environment and how to use it to improve data reliability. The teaching method combines instructor-led sessions with hands-on interactive activities. The course is intended for data engineers and practitioners looking to enhance data reliability and performance in their organizations.
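Delta Lake's reliability guarantees come from an ACID transaction log (the `_delta_log` directory), in which each commit records JSON "add" and "remove" file actions and the current table state is recovered by replaying commits in order. The toy model below sketches that replay idea in plain Python; it is illustrative only, and the real log format and protocol are considerably richer.

```python
# Toy model of the idea behind Delta Lake's _delta_log: each commit is a
# list of "add"/"remove" file actions, and the live table state is the
# result of replaying all commits in order. Illustrative sketch only.

commits = [
    [{"action": "add", "path": "part-000.parquet"}],
    [{"action": "add", "path": "part-001.parquet"}],
    # A compaction commit: replace the two small files with one larger one.
    [
        {"action": "remove", "path": "part-000.parquet"},
        {"action": "remove", "path": "part-001.parquet"},
        {"action": "add", "path": "part-002.parquet"},
    ],
]

def replay(commits):
    """Replay add/remove actions to compute the live set of data files."""
    live = set()
    for commit in commits:
        for act in commit:
            if act["action"] == "add":
                live.add(act["path"])
            elif act["action"] == "remove":
                live.discard(act["path"])
    return live

print(sorted(replay(commits)))  # only the compacted file remains
```

Because commits are applied atomically in order, a reader always sees a consistent snapshot; this is the property the course's History and Vacuum lessons build on.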
Syllabus
Introduction
Data Lakes
Typical Data Lake Project
Who uses Delta
Getting started
Data
Download Data
Parquet Table
Stop Streaming
Initializing Streaming
Working with Parquet
Using Delta Lake
Streaming Job
Multiple Streaming Queries
Counting Continuously
Schema Evolution
Merged Schema
Summary
History
Vacuum
Mods
Merge
Update Data
Define DataFrame
Merge Syntax
Random Data
For Each Batch
Summarize
Community
Question
Thank you
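The Merge lessons above cover Delta Lake's MERGE statement (`MERGE INTO target USING updates ON ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT ...`), which performs an upsert: matched rows are updated and unmatched rows are inserted. The pure-Python sketch below illustrates only those semantics on dictionaries; it is not the Delta Lake API, and the table contents are made up for illustration.

```python
# Toy illustration of the upsert semantics behind Delta Lake's MERGE:
# rows matched on the key are updated, unmatched rows are inserted.
# Pure-Python sketch of the semantics, not the Delta Lake API.

target = {1: {"id": 1, "value": "a"}, 2: {"id": 2, "value": "b"}}
updates = [{"id": 2, "value": "B"}, {"id": 3, "value": "c"}]

def merge(target, updates, key="id"):
    """WHEN MATCHED THEN UPDATE, WHEN NOT MATCHED THEN INSERT."""
    merged = dict(target)
    for row in updates:
        merged[row[key]] = row  # update if key exists, insert otherwise
    return merged

result = merge(target, updates)
print(sorted(result))  # keys after the upsert: [1, 2, 3]
```

In Delta Lake itself the same operation runs as a single atomic commit against the table, which is what makes it safe to apply in streaming jobs via `foreachBatch`, as the For Each Batch lesson discusses.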
Taught by
Databricks