Performance Optimization in Databricks

via Pluralsight

Overview

Databricks performance optimization ensures efficient, scalable, and cost-effective data processing for Delta tables and Parquet file outputs. Best practices evolve over time, and Databricks keeps pace with the open-source community. In this course, Performance Optimization in Databricks, you will dive into the causes of and solutions for performance issues such as skewed data and long-running queries. First, you will see strategies like Z-ordering and using OPTIMIZE for compact data storage. Next, you will learn techniques for query optimization, including best practices for writing efficient SQL and partitioning strategies that reduce execution time. Finally, you will investigate cluster configuration for resource allocation, choosing the right cluster size with autoscaling, and leveraging Databricks’ compute options like Photon for faster processing. When you are finished with this course, you will have the skills needed to performance-tune processes and enhance your Databricks environment while minimizing cost.

Taught by

Thomas LeBlanc
