Overview
Explore the challenges of causal data integration, and the solutions proposed for them, in this 28-minute lecture by Brit Youngmann of MIT. Start with a fundamental difficulty in empirical science, particularly the natural and social sciences: causal inference can produce false discoveries when the underlying data is poorly managed. Examine the two data management issues the lecture focuses on: attribute sets that are incomplete, and misidentification of the attributes relevant to the analysis. Learn why causal inference relies so heavily on domain knowledge, often represented as a causal DAG, and how the analysis suffers when that knowledge is unavailable or incomplete. Discover the Causal Data Integration (CDI) problem and the techniques developed to address it, including methods for integrating datasets with unobserved potential confounding variables and for causal DAG summarization. See how data management techniques can overcome these obstacles to causal inference and support more accurate scientific discoveries.
Syllabus
Causal Data Integration
Taught by
Simons Institute