
Data Engineering with Python and AI/LLMs - Data Loading Tutorial

via freeCodeCamp

Overview

Master data ingestion for data engineering with Python in this hands-on tutorial. It aims to build the mindset of a senior platform engineer so that you can handle common pipeline failures such as schema changes and API limits. You will learn to extract data from APIs, implement automatic schema management, set up incremental loading, and orchestrate scalable, automated workflows using modern tools.

The tutorial has two main sections, taught by Alexey Grigorev and Adrian Brudaru. Topics include the fundamentals of data ingestion, working with REST APIs, normalizing data, loading into databases such as DuckDB, dynamic schema management, data contracts, incremental loading, deployment with GitHub Actions, Crontab, Dagster, and Airflow, and creating pipelines with LLMs.

The complete code is available on GitHub. The nearly 4-hour tutorial was made possible by a grant from dlthub.com.
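
To give a rough sense of what incremental loading means in practice, here is a minimal sketch that is not taken from the course. It assumes each record carries an `updated_at` timestamp in a consistent ISO 8601 format, and it uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra packages. The table name `items`, the sample rows, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical source records; in a real pipeline these would come from an API.
SOURCE_ROWS = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00", "name": "a"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00", "name": "b"},
    {"id": 3, "updated_at": "2024-01-03T00:00:00", "name": "c"},
]


def fetch_since(rows, cursor_value):
    """Return rows strictly newer than the cursor (all rows if there is no cursor)."""
    if cursor_value is None:
        return list(rows)
    # ISO 8601 strings in the same format compare correctly as plain strings.
    return [r for r in rows if r["updated_at"] > cursor_value]


def load_incrementally(conn, rows):
    """Load only rows newer than what the destination already holds."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(id INTEGER PRIMARY KEY, updated_at TEXT, name TEXT)"
    )
    # The cursor is the newest timestamp already loaded (None on the first run).
    last_seen = conn.execute("SELECT MAX(updated_at) FROM items").fetchone()[0]
    new_rows = fetch_since(rows, last_seen)
    # INSERT OR REPLACE overwrites any existing row with the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO items (id, updated_at, name) "
        "VALUES (:id, :updated_at, :name)",
        new_rows,
    )
    conn.commit()
    return len(new_rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_incrementally(conn, SOURCE_ROWS[:2]))  # first run loads everything available
    print(load_incrementally(conn, SOURCE_ROWS))      # second run loads only the newer row
```

Deriving the cursor from the destination table keeps the sketch stateless. Tools such as dlt track this state for you, and their actual APIs differ from the helpers shown here.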

Syllabus

0:00:00 1. Introduction
0:08:02 2. What is data ingestion
0:10:04 3. Extracting data: Data Streaming & Batching
0:14:00 4. Extracting data: Working with REST API
0:29:36 5. Normalizing data
0:43:41 6. Loading data into DuckDB
0:48:39 7. Dynamic schema management
0:56:26 8. What is next?
0:56:36 1. Introduction
0:59:29 2. Overview
1:02:08 3. Extracting data with dlt: dlt RestAPI Client
1:08:05 4. dlt Resources
1:10:42 5. How to configure secrets
1:15:12 6. Normalizing data with dlt
1:24:09 7. Data Contracts
1:31:05 8. Alerting schema changes
1:33:56 9. Loading data with dlt
1:33:56 10. Write dispositions
1:37:34 11. Incremental loading
1:43:46 12. Loading data from SQL database to SQL database
1:47:46 13. Backfilling
1:50:42 14. SCD2
1:54:29 15. Performance tuning
2:03:12 16. Loading data to Data Lakes & Lakehouses & Catalogs
2:12:17 17. Loading data to Warehouses/MPPs, Staging
2:18:15 18. Deployment & orchestration
2:18:15 19. Deployment with GitHub Actions
2:29:04 20. Deployment with Crontab
2:40:05 21. Deployment with Dagster
2:49:47 22. Deployment with Airflow
3:07:00 23. Create pipelines with LLMs: Understanding the challenge
3:10:35 24. Create pipelines with LLMs: Creating prompts and LLM friendly documentation
3:31:38 25. Create pipelines with LLMs: Demo

Taught by

freeCodeCamp.org
