This course introduces basic data science techniques using Python. Students learn core concepts such as DataFrames and joining data, and work with the data analysis libraries pandas, NumPy, and matplotlib. The course covers loading, inspecting, and querying real-world data in order to answer basic questions about it. Students will gain skills in data aggregation and summarization, as well as basic data visualization.
Module 1 : Loading, Querying, & Filtering Data Using the csv Module
This first module gives students an overview of loading, inspecting, and exploring data using Python's built-in csv module. To get started, it includes a brief overview of Jupyter Notebook and a concise review of basic Python, including data structures, loops, and functions. The module then walks students through an in-depth analysis of data stored in a .csv file, including basic querying, approaches for dealing with data errors, and how to filter and sort data based on a variety of criteria.
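As a rough illustration of the kind of workflow Module 1 covers, the sketch below loads rows with the csv module, skips a row with a bad value, then filters and sorts the result. The sample data, column names, and helper functions (`load_rows`, `clean_rows`) are hypothetical and are not taken from the course materials.

```python
import csv
import io

# Hypothetical sample data standing in for a real .csv file on disk.
SAMPLE_CSV = """name,city,price
Widget,Boston,19.99
Gadget,Chicago,n/a
Gizmo,Boston,5.50
Doohickey,Denver,12.00
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Convert 'price' to float, skipping rows whose price is not numeric."""
    cleaned = []
    for row in rows:
        try:
            price = float(row["price"])
        except ValueError:
            # One simple way to deal with a data error: drop the row.
            continue
        cleaned.append({**row, "price": price})
    return cleaned


rows = clean_rows(load_rows(SAMPLE_CSV))

# Filter: keep only the rows for one city.
boston = [r for r in rows if r["city"] == "Boston"]

# Sort: order all cleaned rows by price, lowest first.
by_price = sorted(rows, key=lambda r: r["price"])
```

With a real file, `io.StringIO(text)` would typically be replaced by a file object opened with `open(path, newline="")`.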
Module 2 : Loading, Querying, Joining & Filtering Data Using pandas
In this module, students are introduced to core concepts like the DataFrame and joining data. Students will get experience using pandas, an industry-standard data analysis library, to load and query real-world data and to answer questions about that data. This module demonstrates how to do advanced filtering and indexing, slice subsets of data, restrict data attributes in query results, and do basic computations over the data. It also covers how to build a simple recommendation system, along with approaches for cleaning data, dealing with missing values, and creating new data.
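The operations named in Module 2 might look roughly like the following in pandas. This is a minimal sketch under assumed data: the `movies` and `ratings` tables and all column names are invented for illustration, and the course may use different datasets and techniques.

```python
import pandas as pd

# Hypothetical tables; a real course dataset would be loaded with pd.read_csv.
movies = pd.DataFrame({
    "movie_id": [1, 2, 3, 4],
    "title": ["Alpha", "Beta", "Gamma", "Delta"],
    "year": [1999, 2005, 2010, 2015],
})
ratings = pd.DataFrame({
    "movie_id": [1, 1, 2, 3, 3],
    "rating": [4.0, 5.0, 3.0, None, 2.0],
})

# Dealing with missing values: drop ratings that were never recorded.
ratings_clean = ratings.dropna(subset=["rating"])

# Joining data: match each rating to its movie on the shared key.
joined = movies.merge(ratings_clean, on="movie_id", how="inner")

# Filtering rows and restricting attributes: movies from 2005 onward,
# keeping only the title and rating columns.
recent = joined.loc[joined["year"] >= 2005, ["title", "rating"]]

# Creating new data: rescale the 0-5 rating to a 0-100 score.
joined["rating_pct"] = joined["rating"] * 20
```

An inner join is used here so that movies without any rating drop out; `how="left"` would keep them with a missing rating instead.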
Module 3 : Summarizing & Visualizing Data
This module takes data analysis a step further by providing an overview of the process of aggregating, summarizing, and visualizing data. Students are introduced to the concepts of grouping and indexing data, and learn how to display results in a pivot table using pandas. This module also demonstrates how to prepare and visualize data using a histogram and a scatterplot in Jupyter Notebook. Students will gain skills in data aggregation and summarization, as well as basic data visualization. In addition, students will get experience using data analysis libraries like NumPy and matplotlib.
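A small sketch of the grouping, pivot table, and plotting steps described for Module 3 is shown below. The `sales` table and its columns are hypothetical, and the choice of three histogram bins is arbitrary; it is only meant to show how pandas, NumPy, and matplotlib can fit together.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data invented for this example.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1", "Q2"],
    "units": [10, 15, 7, 12, 5, 8],
    "revenue": [100.0, 160.0, 70.0, 130.0, 45.0, 90.0],
})

# Aggregation: total units sold per region.
totals = sales.groupby("region")["units"].sum()

# Summarization: regions as rows, quarters as columns, summed units as cells.
pivot = sales.pivot_table(
    index="region", columns="quarter", values="units", aggfunc="sum"
)

# A basic NumPy computation over one column.
mean_revenue = np.mean(sales["revenue"].to_numpy())

# Visualization: a histogram of units and a scatterplot of units vs. revenue.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))
ax_hist.hist(sales["units"], bins=3)
ax_hist.set_xlabel("units")
ax_hist.set_ylabel("count")
ax_scatter.scatter(sales["units"], sales["revenue"])
ax_scatter.set_xlabel("units")
ax_scatter.set_ylabel("revenue")
fig.tight_layout()
```

In Jupyter Notebook the figure is normally rendered inline below the cell; in a plain script, `plt.show()` or `fig.savefig(...)` would be needed to see it.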