

Benchmarking Observational Studies with Experimental Data under Right-Censoring

Ilker Demirel · Edward De Brouwer · Zeshan Hussain · Michael Oberst · Anthony Philippakis · David Sontag

MR1 & MR2 - Number 42
Sat 4 May 6 a.m. PDT — 8:30 a.m. PDT


Drawing causal inferences from observational studies (OS) requires unverifiable validity assumptions; however, one can falsify those assumptions by benchmarking the OS with experimental data from a randomized controlled trial (RCT). A major limitation of existing procedures is that they do not account for censoring, despite the abundance of RCTs and OSes that report right-censored time-to-event outcomes. We consider two cases: the censoring time (1) is independent of the time-to-event, or (2) depends on the time-to-event in the same way in the OS and the RCT. For the former, we adopt a censoring-doubly-robust signal for the conditional average treatment effect (CATE) to facilitate an equivalence test of CATEs in the OS and the RCT, which serves as a proxy for testing whether the validity assumptions hold. For the latter, we show that the same test can still be used even though unbiased CATE estimation may not be possible. We verify the effectiveness of our censoring-aware tests via semi-synthetic experiments and analyze RCT and OS data from the Women's Health Initiative study.
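
To make the role of a censoring-aware signal concrete, here is a minimal sketch for case (1), where the censoring time is independent of the time-to-event. It is not the censoring-doubly-robust signal from the paper; it uses a simpler stand-in, inverse-probability-of-censoring weighting (IPCW) with a Kaplan-Meier estimate of the censoring distribution, applied to the restricted outcome min(T, tau). All function names, the horizon `tau`, and the simulated exponential distributions are assumptions made for illustration only.

```python
import numpy as np


def censoring_survival_left(y, delta, t):
    """Kaplan-Meier estimate of G(t-) = P(C >= t), treating censorings as events.

    Assumes no ties among censored observation times (continuous data).
    """
    order = np.argsort(y, kind="stable")
    y_sorted = y[order]
    censored = 1 - delta[order]            # 1 where the observation was censored
    n = len(y_sorted)
    at_risk = n - np.arange(n)             # subjects still under observation
    surv = np.cumprod(1.0 - censored / at_risk)
    surv = np.concatenate(([1.0], surv))   # G(0-) = 1
    # Left limit: only censoring times strictly before t contribute.
    idx = np.searchsorted(y_sorted, t, side="left")
    return surv[idx]


def ipcw_pseudo_outcome(y, delta, tau):
    """IPCW pseudo-outcome whose mean estimates E[min(T, tau)].

    y     : observed time, min(T, C)
    delta : 1 if the event was observed (T <= C), else 0
    tau   : hypothetical restriction horizon
    """
    y_tau = np.minimum(y, tau)
    # Anyone followed up to tau has min(T, tau) = tau fully observed.
    delta_tau = np.where(y >= tau, 1, delta)
    g_left = censoring_survival_left(y_tau, delta_tau, y_tau)
    pseudo = np.zeros(len(y_tau))
    observed = delta_tau == 1
    pseudo[observed] = y_tau[observed] / g_left[observed]
    return pseudo


if __name__ == "__main__":
    # Simulated data (an assumption): T ~ Exp(mean 1), C ~ Exp(mean 2), independent.
    rng = np.random.default_rng(0)
    n, tau = 20000, 1.5
    t = rng.exponential(1.0, n)
    c = rng.exponential(2.0, n)
    y, delta = np.minimum(t, c), (t <= c).astype(int)

    print("true E[min(T, tau)]   :", 1 - np.exp(-tau))
    print("naive mean of min(Y,tau):", np.minimum(y, tau).mean())
    print("IPCW pseudo-outcome mean:", ipcw_pseudo_outcome(y, delta, tau).mean())
```

In this sketch the naive average of the observed times ignores censoring and is biased toward shorter times, while the weighted pseudo-outcome corrects for it when censoring is independent of the event time. A benchmarking procedure in the spirit of the abstract would compute such a signal separately in the OS and the RCT and then compare the resulting treatment-effect estimates; that comparison step is not shown here.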
