Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

University of Colorado Boulder

Data Collection and Integration

University of Colorado Boulder via Coursera

Overview

The "Data Collection and Integration" course provides students with comprehensive techniques for gathering data from diverse sources, including files, relational databases, web pages, and APIs. Participants will gain practical experience in collecting and integrating data for further processing and analysis. The course emphasizes the utilization of appropriate tools and packages, such as Pandas, Beautiful Soup, and SQL, to effectively handle real-life datasets and address data integration challenges.

Syllabus

  • Collect Data From Files
    • The "Collect Data from Files" week focuses on equipping you with the necessary skills to handle various file formats, such as txt, csv, json, xml, html, and more, for effective data collection. You will learn how to read, parse, and extract relevant data from different file types, enabling you to gather valuable information from diverse sources.
  • Collect Data From Web
    • The "Collect Data from Web" week focuses on empowering you with the skills to extract data from various webpage formats using Python libraries like requests and Beautiful Soup. You will learn how to access web pages, retrieve HTML content, and parse the data to collect relevant information effectively.
  • Collect Data From Database
    • The "Collect Data from Database" week focuses on equipping you with the skills to interact with various SQL-like databases using Python packages. You will learn how to connect to databases, execute queries, and retrieve data from different database systems, enabling you to collect and utilize data efficiently.
  • Collect Data From APIs
    • The "Collect Data from APIs" week focuses on enabling you to interact with various websites that provide Application Programming Interfaces (APIs). You will learn how to access APIs, retrieve data in structured formats (e.g., JSON or XML), and utilize Python to process and extract valuable information from API responses.
  • Data Integration
    • The "Data Integration" week focuses on the techniques and methodologies for integrating data collected from various sources. You will learn how to combine and merge datasets, handle data inconsistencies, and create a unified dataset for further analysis and decision-making.
  • Case Study
    • The "Case Study" week offers you the opportunity to apply the knowledge you have learned throughout the course in a practical and comprehensive case study. You will engage in data collection from various sources, including files, SQL-like databases, and web APIs, and then integrate the collected data into a unified dataset for further analysis. This week serves as a culminating activity, allowing you to demonstrate your skills in data collection, integration, and preparation for analysis.

Taught by

Di Wu

Reviews

Start your review of Data Collection and Integration

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.