Overview
PySpark combines Spark's parallel computing platform with the Python language to enable the analysis of large datasets. In this course, Big Data Analytics with PySpark, you’ll gain the ability to transform and analyze large datasets with this API. First, you’ll explore ingesting and transforming datasets with PySpark. Next, you’ll discover how to analyze and aggregate datasets and how to optimize the performance of PySpark operations. Finally, you’ll learn how to visualize and export the results of your analysis. When you’re finished with this course, you’ll have the PySpark skills and knowledge needed to tackle your next big data analytics project.
Taught by
Warner Chaves