In this project, you will learn how to load a dataset and extract useful information from it using Python, a free, open-source programming language that you can download. You will then learn how to clean the dataset by removing unwanted whitespace, columns that contain several empty values, rows that contain empty values, and duplicate rows. Next, you will create visualizations to identify patterns and outliers in the dataset and to examine correlations between columns. Lastly, you will learn how to visualize a high-dimensional dataset using principal component analysis (PCA).
These steps are part of exploratory data analysis (EDA), which you will need to carry out in any data science project to understand your dataset.
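As a rough illustration of the cleaning steps described above, here is a minimal sketch. It assumes the pandas library, which this description does not name. The function name `clean_dataset`, the `max_missing_fraction` threshold, and the sample data are all hypothetical and chosen only for the example.

```python
import pandas as pd


def strip_if_str(value):
    """Strip surrounding whitespace from strings; leave other values unchanged."""
    return value.strip() if isinstance(value, str) else value


def clean_dataset(df: pd.DataFrame, max_missing_fraction: float = 0.5) -> pd.DataFrame:
    """Apply four cleaning steps (hypothetical helper, not the course's own code)."""
    # Work on a copy with whitespace stripped from the column names.
    df = df.rename(columns=strip_if_str)

    # 1. Remove unwanted whitespace from text cells in non-numeric columns.
    for col in df.columns:
        if not pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].map(strip_if_str)

    # 2. Drop columns in which more than the chosen fraction of values is missing.
    #    The 0.5 default is an assumption; pick a threshold that suits your data.
    df = df.loc[:, df.isna().mean() <= max_missing_fraction]

    # 3. Drop rows that still contain any missing value.
    df = df.dropna()

    # 4. Drop duplicated rows and renumber the index.
    return df.drop_duplicates().reset_index(drop=True)


# Made-up sample data, used only to exercise the function.
raw = pd.DataFrame({
    " name ": ["  Ann", "Bob ", "Bob", None, "Eve"],
    "age": [34, 29, 29, 41, None],
    "notes": [None, None, None, "n/a", None],
})

cleaned = clean_dataset(raw)
print(cleaned)
```

The order of the steps matters in this sketch: whitespace is stripped first so that values such as "Bob " and "Bob" are recognized as duplicates, and sparse columns are dropped before rows so that one mostly empty column does not cause nearly every row to be discarded.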
Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.