Poster 131

On the Number of Conditional Independence Tests in Constraint-based Causal Discovery

Marc Franquesa Monés · Jiaqi Zhang · Caroline Uhler

Abstract

Learning causal relationships from observational data is a fundamental problem with applications across many fields. Constraint-based discovery methods infer the underlying causal structure by running conditional independence tests on the data. However, existing algorithms such as the PC algorithm require a large number of conditional independence tests, which in the worst case is exponential in the maximum degree of the causal graph. Despite extensive research, it remains unclear whether algorithms with better complexity exist without additional assumptions. Here, we establish an algorithm that achieves a better complexity of $\mathcal{O}(\exp(s))$ tests, where $s$ is the maximum undirected clique size of the underlying essential graph. Complementing this, we prove that any constraint-based algorithm must perform at least $\Omega(\exp(s))$ conditional independence tests, so our proposed algorithm is optimal in the number of conditional independence tests required. Finally, we validate our theoretical findings on simulations, semi-synthetic gene-expression data, and real-world data, demonstrating the efficiency of our algorithm compared to existing methods.
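
To give a feel for why the number of tests can grow exponentially, the following minimal Python sketch enumerates every candidate conditioning set that could be drawn from the neighbors of one node when testing a single pair of variables. It is an illustration only: it is neither the algorithm proposed here nor a full implementation of the PC algorithm, and the function names `candidate_conditioning_sets` and `count_candidate_sets` are hypothetical. Under the assumption that all subsets of a neighbor set of size $d$ may have to be tried, a single pair already admits $2^d$ candidate tests.

```python
from itertools import combinations


def candidate_conditioning_sets(neighbors):
    """Yield every subset of `neighbors`, smallest subsets first.

    Hypothetical helper for illustration: each yielded subset is one
    candidate conditioning set for testing a single pair of variables.
    """
    for size in range(len(neighbors) + 1):
        for subset in combinations(neighbors, size):
            yield subset


def count_candidate_sets(degree):
    """Count the candidate conditioning sets for a node with `degree` neighbors."""
    neighbors = list(range(degree))
    return sum(1 for _ in candidate_conditioning_sets(neighbors))


# The count doubles with every additional neighbor: 2**degree subsets in total.
for d in (2, 5, 10):
    print(d, count_candidate_sets(d))
# prints:
# 2 4
# 5 32
# 10 1024
```

Replacing the neighbor set with a smaller set, such as one bounded by the clique size $s$, shrinks this enumeration in the same exponential fashion, which is the intuition behind measuring complexity in terms of $s$ rather than the maximum degree.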