AISTATS Poster Manifold-Aligned Counterfactual Explanations for Neural Networks

Poster

Manifold-Aligned Counterfactual Explanations for Neural Networks

Asterios Tsiourvas · Wei Sun · Georgia Perakis

MR1 & MR2 - Number 104

Abstract:

We study the problem of finding optimal manifold-aligned counterfactual explanations for neural networks. Existing approaches that solve a complex mixed-integer programming (MIP) problem frequently suffer from scalability issues, which limits their practical usefulness. Furthermore, their solutions are not guaranteed to follow the data manifold, resulting in unrealistic counterfactual explanations. To address these challenges, we first present a MIP formulation in which we explicitly enforce manifold alignment by reformulating the highly nonlinear Local Outlier Factor (LOF) metric as mixed-integer constraints. To address the computational challenge, we leverage the geometry of a trained neural network and propose an efficient decomposition scheme that reduces the initial large, hard-to-solve optimization problem to a series of significantly smaller, easier-to-solve problems by constraining the search space to “live” polytopes, i.e., regions that contain at least one actual data point. Experiments on real-world datasets demonstrate that our approach produces counterfactual explanations that are both optimal and realistic while remaining computationally tractable.
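
As an informal illustration of the “live” polytope idea mentioned in the abstract, the sketch below assumes a ReLU network, for which each on/off activation pattern of the hidden units corresponds to one linear region (polytope) of the input space. It groups data points by activation pattern, so every pattern that appears is a region containing at least one actual data point. The abstract does not specify the architecture or the decomposition details, so the ReLU assumption, the function names `relu_activation_pattern` and `live_polytopes`, and the toy weights and data are all hypothetical; this is not the authors' implementation.

```python
from collections import defaultdict


def relu_activation_pattern(x, layers):
    """Return the on/off pattern of every hidden ReLU unit for input x.

    `layers` is a list of (W, b) pairs for the hidden layers, where W is a
    list of rows and b is a list of biases (hypothetical plain-list format).
    """
    pattern = []
    h = list(x)
    for W, b in layers:
        # Pre-activation of each unit in this layer.
        pre = [sum(w * v for w, v in zip(row, h)) + bias
               for row, bias in zip(W, b)]
        pattern.extend(1 if z > 0 else 0 for z in pre)
        # ReLU output feeds the next layer.
        h = [max(z, 0.0) for z in pre]
    return tuple(pattern)


def live_polytopes(data, layers):
    """Map each observed activation pattern to the indices of the data
    points that fall in the corresponding region."""
    groups = defaultdict(list)
    for i, x in enumerate(data):
        groups[relu_activation_pattern(x, layers)].append(i)
    return dict(groups)


# Toy example (assumed values): 2 inputs, one hidden layer of 2 ReLU units.
hidden_layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])]
data = [[1.0, 2.0], [3.0, 0.5], [-1.0, 2.0], [-2.0, 1.0]]

regions = live_polytopes(data, hidden_layers)
for pattern, members in regions.items():
    print(pattern, members)
```

In this toy setting, two hidden units allow up to four activation patterns, but the four data points occupy only two of them. A decomposition in the spirit of the abstract would then solve one smaller problem per occupied region instead of one large problem over all regions.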
