AISTATS Poster Achieving Fairness through Separability: A Unified Framework for Fair Representation Learning

Poster

Achieving Fairness through Separability: A Unified Framework for Fair Representation Learning

Taeuk Jang · Hongchang Gao · Pengyi Shi · Xiaoqian Wang

MR1 & MR2 - Number 70


Abstract:

Fairness is a growing concern in machine learning, as state-of-the-art models may amplify social prejudice by making biased predictions against specific demographic groups, such as those defined by race or gender. Such discrimination raises issues in various fields, including employment, criminal justice, and trust score evaluation. To address these concerns, we propose learning fair representations through a straightforward yet effective approach that projects intrinsic information while filtering out sensitive information for downstream tasks. Our model has two goals: one is to ensure that the latent data from different demographic groups is non-separable (i.e., to make the latent data distribution independent of the sensitive feature to improve fairness); the other is to maximize the separability of latent data from different classes (i.e., to maintain the discriminative power of the data for downstream tasks such as classification). Our method adopts a non-zero-sum adversarial game to minimize the distance between data from different demographic groups while maximizing the margin between data from different classes. Moreover, the proposed objective function can be easily generalized to multiple sensitive attributes and multi-class scenarios, as it upper bounds popular fairness metrics in these cases. We provide a theoretical analysis of the fairness of our model and validate it with respect to both fairness and predictive performance on benchmark datasets.
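The two goals described in the abstract can be illustrated with a small numerical sketch. This is not the authors' objective or their adversarial training procedure: it assumes a simple stand-in in which separability is measured by the Euclidean distance between latent group means, and the names `mean_gap`, `separability_objective`, and the weight `lam` are hypothetical.

```python
import numpy as np

def mean_gap(z, labels):
    """Largest Euclidean distance between the latent means of any two groups.

    Assumption: distance between group means is only a crude proxy for
    separability, chosen here for illustration. Requires at least two groups.
    """
    means = [z[labels == g].mean(axis=0) for g in np.unique(labels)]
    return max(
        float(np.linalg.norm(a - b))
        for i, a in enumerate(means)
        for b in means[i + 1:]
    )

def separability_objective(z, y, s, lam=1.0):
    """Hypothetical toy objective (lower is better).

    Penalizes separation between sensitive groups `s` and rewards
    separation between classes `y`, weighted by `lam`.
    """
    return mean_gap(z, s) - lam * mean_gap(z, y)

# Toy latent codes: classes differ along axis 0, sensitive groups along axis 1.
z = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])  # class labels
s = np.array([0, 1, 0, 1])  # sensitive attribute

class_gap = mean_gap(z, y)   # separation between classes (to be kept large)
group_gap = mean_gap(z, s)   # separation between demographic groups (to be kept small)
objective = separability_objective(z, y, s)
print(class_gap, group_gap, objective)
```

In this toy setting, an encoder that collapsed the second latent axis would drive the group gap to zero while leaving the class gap unchanged, which is the direction the abstract's two goals point in.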
