

Poster

Is Merging Worth It? Securely Evaluating the Information Gain for Causal Dataset Acquisition

Antoine Moulin · Alex Buna · Danqi Liao · Patrick Rebeschini


Abstract:

Merging datasets across institutions is a lengthy and costly procedure, especially when it involves private information. Data hosts may therefore want to prospectively gauge which datasets are most beneficial to merge with, without revealing sensitive information. For causal estimation this is particularly challenging, as the value of a merge depends not only on the reduction in epistemic uncertainty but also on the improvement in overlap. To address this challenge, we introduce the first cryptographically secure information-theoretic approach for quantifying the value of a merge in the context of heterogeneous treatment effect estimation. We do this by evaluating the Expected Information Gain (EIG) using multi-party computation to ensure that no raw data is revealed. We further demonstrate that our approach can be combined with differential privacy (DP) to meet arbitrary privacy requirements while yielding more accurate computation than DP alone. To the best of our knowledge, this work presents the first privacy-preserving method for dataset acquisition tailored to causal estimation. Code is publicly available: https://github.com/LucileTerminassian/causalprospectivemerge.
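To make the idea of evaluating an information gain without exchanging raw data concrete, here is a minimal toy sketch. It is not the authors' protocol. It assumes a one-parameter Bayesian linear model y = theta * x + noise with a Gaussian prior, for which the EIG of adding a candidate dataset is 0.5 * log(posterior precision after the merge / posterior precision before). Each party contributes only the sum of squares of its covariates, split into additive secret shares. The names `share`, `reconstruct`, `eig_of_merge`, `PRIME`, and `SCALE` are hypothetical, and unlike a full multi-party computation this toy reveals the combined statistic to the host rather than only the final EIG.

```python
import math
import secrets

PRIME = 2**61 - 1  # modulus for additive shares (illustrative choice)
SCALE = 10**6      # fixed-point scaling for real-valued statistics


def share(value, n_parties=2):
    """Split a non-negative real number into additive shares modulo PRIME."""
    encoded = round(value * SCALE) % PRIME
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((encoded - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the encoded real number."""
    return (sum(shares) % PRIME) / SCALE


def eig_of_merge(host_x, cand_x, prior_var=1.0, noise_var=1.0):
    """EIG about theta from adding the candidate covariates (toy linear-Gaussian model)."""
    s_host = sum(x * x for x in host_x)  # computed locally by the host
    s_cand = sum(x * x for x in cand_x)  # computed locally by the candidate
    host_shares = share(s_host)
    cand_shares = share(s_cand)
    # Each compute party adds the two shares it holds; no raw rows are exchanged.
    summed = [(h + c) % PRIME for h, c in zip(host_shares, cand_shares)]
    s_total = reconstruct(summed)
    precision_before = 1.0 / prior_var + s_host / noise_var
    precision_after = 1.0 / prior_var + s_total / noise_var
    return 0.5 * math.log(precision_after / precision_before)
```

A host could call `eig_of_merge` once per candidate dataset and rank candidates by the returned value. A production system would evaluate the logarithm inside the secure computation as well, so that only the EIG itself is disclosed.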
