Causal Data Integration

Prof. Brit Youngmann, Technion

Causal inference is fundamental to empirical scientific discovery in the natural and social sciences; however, data management problems that arise while conducting causal inference can lead to false discoveries. Two such problems are (i) lacking some of the attributes required for the analysis, and (ii) misidentifying which attributes should be included in the analysis. Analysts often have access only to partial data, and they rely critically on domain knowledge, typically given in the form of a causal DAG, to identify the attributes to include; this knowledge is often unavailable or incomplete. We argue that data management techniques can surmount both challenges. In this work, we introduce the Causal Data Integration (CDI) problem and discuss our proposed solution, which includes techniques for integrating input datasets with unobserved potential confounding variables and for summarizing causal DAGs.

Prof. Brit Youngmann

Brit Youngmann is an Assistant Professor at the Technion’s Faculty of Computer Science. Her research focuses on data management, causal reasoning, and responsible data management. She develops automated data tools that support the data analysis scientists perform, with the aim of accelerating scientific discovery. Her research draws on ideas from data management and causal reasoning, making them practical for goal-oriented scientists working with real-life datasets. Before joining the Technion, she was a postdoctoral researcher in the Data Systems Group at MIT, and she received her Ph.D. in Computer Science from Tel Aviv University.