Overview

This course teaches learners how to analyze money in politics using Python. By the end of the course, students will be able to configure their computer for Python, download campaign data, analyze it with pandas in a Jupyter Notebook, track changes with git, and publish their analysis online. The teaching method is hands-on projects using Python, pandas, and Jupyter Notebook. The course is intended for data journalists and others interested in using Python to analyze political funding.

Syllabus

Module 0: Hello world
In this introductory module, you will learn how to configure your computer to work with Python. Before you can use it to analyze data, your computer needs the following tools installed:
A command-line interface to interact with your computer
The git version control software and a GitHub account
Version 2.7 of the Python programming language
The pip package manager and virtualenv environment manager for Python
A code compiler that can install our heavy-duty analysis tools
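
One way to confirm the setup is to ask Python itself which tools it can find. The sketch below is an illustration rather than part of the course materials: the helper name check_tools is hypothetical, and the command names git, pip, and virtualenv are assumed to match how the tools are installed on your system. It is written to run on both Python 2.7 and Python 3.

```python
import sys

try:
    from shutil import which  # available in Python 3.3 and later
except ImportError:
    from distutils.spawn import find_executable as which  # fallback for Python 2.7


def check_tools(names):
    """Map each command name to True if it is found on the PATH."""
    return {name: which(name) is not None for name in names}


# Report the interpreter version, then each command-line tool.
print("Python version: {}.{}".format(*sys.version_info[:2]))
for tool, found in sorted(check_tools(["git", "pip", "virtualenv"]).items()):
    print("{}: {}".format(tool, "found" if found else "missing"))
```

A tool reported as missing may still be installed under a different command name, so treat the output as a first check only.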

Module 1: Hello notebook
This week you will learn how to start a new Python analysis project and be introduced to pandas and the Jupyter Notebook. You will use them to draft an elementary data analysis that is clear and reproducible.
Creating a new Python workspace with virtualenv
Using pip to install the pandas and Jupyter Notebook libraries
Creating your first Jupyter Notebook
How to write Python code in a Jupyter Notebook
Importing the pandas library into the Jupyter Notebook
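
A first notebook cell might look like the sketch below. It assumes pandas has already been installed into the active environment with pip. The alias pd is a widely used convention, not a requirement, and the sample table is invented for illustration.

```python
import pandas as pd

# Confirm which pandas release the notebook is using.
print(pd.__version__)

# A tiny, made-up table to verify that pandas works.
sample = pd.DataFrame({"name": ["a", "b"], "amount": [10, 20]})

# In a notebook, a bare expression on the last line of a cell is displayed.
sample
```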

Module 2: Hello data
This week you will download a list of campaign contributors published by the California Civic Data Coalition and load it into a Jupyter Notebook for analysis with pandas. This class will cover:
Learning how the money funding campaigns is tracked in the United States
Downloading campaign data from the California Civic Data Coalition website
Importing structured data files as a DataFrame with pandas’ read_csv method
Inspecting DataFrames with pandas’ info and head methods
Inspecting and summarizing DataFrame columns with pandas’ value_counts, describe, and sum methods
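
The methods listed above can be sketched on a small stand-in dataset. Everything about the data here is an assumption: the column names and rows are invented and do not reflect the actual California Civic Data Coalition file layout. The CSV is read from an in-memory string so the example runs without a download; with a real file you would pass its path to read_csv instead. The example is written for a current Python 3 and pandas release.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded contributions file.
csv_text = """committee_name,contributor_lastname,contributor_state,amount
Yes on 1,Smith,CA,100.0
Yes on 1,Jones,CA,250.0
No on 1,Lee,NY,500.0
No on 1,Garcia,CA,50.0
"""

# read_csv accepts a file path or any file-like object.
contribs = pd.read_csv(io.StringIO(csv_text))

# info lists the columns, their types, and non-null counts.
contribs.info()

# head shows the first rows (five by default).
print(contribs.head())

# value_counts tallies how often each value appears in a column.
state_counts = contribs["contributor_state"].value_counts()
print(state_counts)

# describe gives summary statistics for a numeric column.
print(contribs["amount"].describe())

# sum adds up the column.
total = contribs["amount"].sum()
print(total)
```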

Module 3: Hello analysis
This week you will learn how to use pandas to conduct a data analysis and document your work with the Jupyter Notebook. It will cover:
Filtering a DataFrame with pandas’ indexing system
Merging two DataFrames with pandas’ merge method
Sorting a DataFrame with pandas’ sort_values method
Aggregating a DataFrame with pandas’ groupby method
Using these tools to responsibly navigate and analyze California campaign data
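
The four operations above can be chained in a short sketch. The two tables and all of their column names are hypothetical, chosen only to make the example self-contained, and the 100-dollar threshold is arbitrary.

```python
import pandas as pd

# Hypothetical committee and contribution tables sharing a committee_id key.
committees = pd.DataFrame({
    "committee_id": [1, 2],
    "committee_name": ["Yes on 1", "No on 1"],
})
contribs = pd.DataFrame({
    "committee_id": [1, 1, 2, 2],
    "contributor_lastname": ["Smith", "Jones", "Lee", "Garcia"],
    "amount": [100.0, 250.0, 500.0, 50.0],
})

# Filter: keep rows where a boolean condition is true.
large = contribs[contribs["amount"] >= 100]

# Merge: attach committee names to each contribution via the shared key.
merged = large.merge(committees, on="committee_id")

# Sort: largest contributions first.
ranked = merged.sort_values("amount", ascending=False)
print(ranked)

# Aggregate: total amount raised per committee.
totals = merged.groupby("committee_name")["amount"].sum()
print(totals)
```

Filtering before merging keeps the merged table small, but the order is a design choice. Merging first and filtering afterward gives the same rows here.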

Module 4: Hello Internet
This week you will learn how to log changes to your Jupyter Notebook with version-control software and publish your analysis on the Internet. It will cover:
The git version control software and its integration with GitHub’s social network
How data journalists use GitHub and Jupyter Notebook to publish their work
How to use the Markdown markup language to annotate a Jupyter Notebook
How to create a new git code repository and start tracking code
How to connect the repository to GitHub and publish a Jupyter Notebook
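
The last two steps usually come down to a handful of git commands. The sketch below lists them in order from Python and, by default, only prints them. The notebook filename, the remote URL, the commit message, and the branch name main are all placeholders, and the function names are hypothetical. Nothing is sent to GitHub unless dry_run is set to False inside a project folder with a real remote.

```python
import subprocess


def publish_commands(notebook_path, remote_url, message="Add analysis notebook"):
    """Return the git commands, in order, as argument lists."""
    return [
        ["git", "init"],                                  # create the repository
        ["git", "add", notebook_path],                    # start tracking the notebook
        ["git", "commit", "-m", message],                 # log the first change
        ["git", "branch", "-M", "main"],                  # name the branch (assumed: main)
        ["git", "remote", "add", "origin", remote_url],   # connect to GitHub
        ["git", "push", "-u", "origin", "main"],          # publish
    ]


def publish(notebook_path, remote_url, dry_run=True):
    """Print each command; run it too only when dry_run is False."""
    for command in publish_commands(notebook_path, remote_url):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Placeholder values: replace with your own notebook and repository URL.
publish("analysis.ipynb", "https://github.com/your-username/your-repo.git")
```

The same commands can be typed directly into the command-line interface. Wrapping them in Python only keeps the example in one language.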