Airflow Distributed Workloads - ETL Architecture with AWS Lambda and Multi-Threaded Python Scripts
Big Data Demystified via YouTube
Overview
Learn how to architect and implement large-scale ETL processes for third-party APIs through a Hebrew-language technical talk that compares different distributed workload approaches. Explore the advantages and limitations of Apache Airflow, multi-threaded Python scripts, and AWS Lambda functions for handling data processing at scale. Gain practical insights into architectural considerations and best practices for choosing the right tool based on specific use cases and requirements. Master key concepts for optimizing performance and resource utilization when building distributed ETL pipelines.
Syllabus
Airflow Distributed Workloads, AWS Lambda, Multi-Threaded Python Script [Hebrew]
Taught by
Big Data Demystified