Oral: Causality
Auditorium 1
We develop a novel approach towards causal inference. Rather than structural equations over a causal graph, we learn stochastic differential equations (SDEs) whose stationary densities model a system's behavior under interventions. These stationary diffusion models do not require the formalism of causal graphs, let alone the common assumption of acyclicity. We show that in several cases, they generalize to unseen interventions on their variables, often better than classical approaches. Our inference method is based on a new theoretical result that expresses a stationarity condition on the diffusion's generator in a reproducing kernel Hilbert space. The resulting kernel deviation from stationarity (KDS) is an objective function of independent interest.
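For intuition, a stationarity condition on a diffusion's generator can be written as below. The notation (drift $f$, diffusion coefficient $\sigma$, test function $h$, generator $\mathcal{L}$, stationary density $\mu$) is assumed here for illustration and does not come from the abstract. Regularity conditions are omitted.

```latex
\mathrm{d}x_t = f(x_t)\,\mathrm{d}t + \sigma(x_t)\,\mathrm{d}W_t,
\qquad
(\mathcal{L}h)(x) = f(x)^\top \nabla h(x)
  + \tfrac{1}{2}\operatorname{tr}\!\big(\sigma(x)\sigma(x)^\top \nabla^2 h(x)\big),
\qquad
\mathbb{E}_{x\sim\mu}\big[(\mathcal{L}h)(x)\big] = 0
\quad \text{for all suitable test functions } h .
```

The abstract states that the KDS objective expresses a condition of this type in a reproducing kernel Hilbert space. Its exact form is not reproduced here.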
General Identifiability and Achievability for Causal Representation Learning
Burak Varici · Emre Acartürk · Karthikeyan Shanmugam · Ali Tajer
This paper focuses on causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent data to the observational data. It establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph. Notably, one does not know which pair of intervention environments has the same node intervened (hence, uncoupled). For identifiability, the paper establishes that perfect recovery of the latent causal model and variables is guaranteed under uncoupled interventions. For achievability, an algorithm is designed that uses observational and interventional data and recovers the latent causal model and variables with provable guarantees. This algorithm leverages score variations across different environments to estimate the inverse of the transformation and, subsequently, the latent variables. The analysis additionally recovers the identifiability result for two hard coupled interventions, that is, when metadata about the pair of environments that have the same node intervened is known. This paper also shows that when observational data is available, additional faithfulness assumptions that are adopted by the existing literature are unnecessary.
Membership Testing in Markov Equivalence Classes via Independence Queries
Jiaqi Zhang · Kirankumar Shiragur · Caroline Uhler
Understanding causal relationships between variables is a fundamental problem with broad impact in numerous scientific fields. While extensive research has been dedicated to learning causal graphs from data, its complementary concept of testing causal relationships has remained largely unexplored. While learning involves the task of recovering the Markov equivalence class (MEC) of the underlying causal graph from observational data, the testing counterpart addresses the following critical question: Given a specific MEC and observational data from some causal graph, can we determine if the data-generating causal graph belongs to the given MEC? We explore constraint-based testing methods by establishing bounds on the required number of conditional independence tests. Our bounds are in terms of the size of the maximum undirected clique ($s$) of the given MEC. In the worst case, we show a lower bound of $\exp(\Omega(s))$ independence tests. We then give an algorithm that resolves the task with $\exp(O(s))$ tests, matching our lower bound. Compared to the learning problem, where algorithms often use a number of independence tests that is exponential in the maximum in-degree, this shows that testing is relatively easier. In particular, it requires exponentially fewer independence tests in graphs featuring high in-degrees and small clique sizes. Additionally, using the DAG associahedron, we provide a geometric interpretation of testing versus learning and discuss how our testing result can aid learning.
On the Misspecification of Linear Assumptions in Synthetic Controls
Achille Nazaret · Claudia Shi · David Blei
The synthetic control (SC) method is popular for estimating causal effects from observational panel data. It rests on a crucial assumption that we can write the treated unit as a linear combination of the untreated units. In practice, this assumption may not hold, and when violated, the resulting SC estimates are incorrect. This paper examines two questions: (1) How large can the misspecification error be? (2) How can we minimize it? First, we provide theoretical bounds to quantify the misspecification error. The bounds are comforting: small misspecifications induce small errors. With these bounds in hand, we develop new SC estimators specially designed to minimize misspecification error. The estimators are based on additional data about each unit. (E.g., if the units are countries, it might be demographic information about each.) We study our estimators on synthetic data; we find they produce more accurate causal estimates than standard SC. We then re-analyze the California tobacco program data of the original SC paper, now including additional data from the US census about per-state demographics. Our estimators show that the observations in the pre-treatment period lie within the bounds of misspecification error and that observations post-treatment lie outside of those bounds. This is evidence that our SC methods have uncovered a true effect.
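The linear-combination assumption behind standard SC can be illustrated with a minimal sketch. It assumes the common variant in which donor weights are nonnegative and sum to one, and it fits them by projected gradient descent, a solver chosen here for simplicity. The function names and the toy numbers are hypothetical, and this is not the estimator proposed in the paper.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:  # holds for a prefix of k; keep the last valid shift
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fit_sc_weights(donors_pre, treated_pre, steps=2000):
    """Minimize 0.5 * ||donors_pre @ w - treated_pre||^2 over the simplex."""
    n_donors = len(donors_pre[0])
    w = [1.0 / n_donors] * n_donors
    # Step size 1 / ||Y||_F^2 is a conservative bound that keeps the descent stable.
    step = 1.0 / sum(x * x for row in donors_pre for x in row)
    for _ in range(steps):
        residuals = [sum(x * wj for x, wj in zip(row, w)) - y
                     for row, y in zip(donors_pre, treated_pre)]
        grad = [sum(r * row[j] for r, row in zip(residuals, donors_pre))
                for j in range(n_donors)]
        w = project_to_simplex([wj - step * g for wj, g in zip(w, grad)])
    return w


def predict(donors, w):
    """Synthetic (counterfactual) outcome for each period."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in donors]


# Toy data (made up): rows are pre-treatment periods, columns are donor units.
donors_pre = [[1, 2, 5], [2, 1, 3], [3, 4, 1], [4, 3, 2], [5, 6, 0]]
# The treated unit is constructed as 0.6 * donor 0 + 0.4 * donor 1.
treated_pre = [1.4, 1.6, 3.4, 3.6, 5.4]

weights = fit_sc_weights(donors_pre, treated_pre)
donors_post = [[6, 5, 1]]  # donor outcomes in one post-treatment period
counterfactual = predict(donors_post, weights)
print([round(w, 3) for w in weights], round(counterfactual[0], 3))
```

In this toy case the linear assumption holds exactly, so the pre-treatment fit is perfect. When the treated unit is not a linear combination of the donors, the fitted weights absorb the mismatch, and this is the misspecification error that the abstract sets out to bound.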