

Poster

Learning High-dimensional Gaussians from Censored Data

Debmalya Mandal · Stefanie Jegelka · Themis Gouleakis · Yuhao Wang


Abstract: We provide efficient algorithms for the problem of distribution learning from high-dimensional Gaussian data where in each sample, some of the variable values are missing. We suppose that the variables are *missing not at random (MNAR)*. The missingness model, denoted by $S(y)$, is the function that maps any point $y \in \mathbb{R}^d$ to the subset of its coordinates that are seen. In this work, we assume that $S(y)$ is known. We study the following two settings:

- [**Self-censoring**] An observation $x$ is generated by first sampling the true value $y$ from a $d$-dimensional Gaussian $\mathcal{N}(\mu, \Sigma)$ with unknown $\mu$ and $\Sigma$. For each coordinate $i$, there exists a set $S_i \subseteq \mathbb{R}$ such that $x_i = y_i$ if and only if $y_i \in S_i$. Otherwise, $x_i$ is missing and takes a generic value (e.g., "?"). We design an algorithm that learns $\mathcal{N}(\mu, \Sigma)$ up to TV distance $\varepsilon$, using $\mathrm{poly}(d, 1/\varepsilon)$ samples, assuming only that each pair of coordinates is observed with sufficiently high probability.
- [**Linear thresholding**] An observation $x$ is generated by first sampling $y$ from a $d$-dimensional Gaussian $\mathcal{N}(\mu, \Sigma)$ with unknown $\mu$ and known $\Sigma$, and then applying the missingness model $S$ where $S(y) = \{i \in [d] : v_i^T y \leq b_i\}$ for some $v_1, \ldots, v_d \in \mathbb{R}^d$ and $b_1, \ldots, b_d \in \mathbb{R}$. We design an efficient mean estimation algorithm, assuming that none of the possible missingness patterns is very rare conditioned on the values of the observed coordinates and that any small subset of coordinates is observed with sufficiently high probability.
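To make the two data-generation models concrete, here is a minimal simulation sketch (not code from the paper; all function names, the interval-shaped choice of each $S_i$, and the specific $v_i$, $b_i$ values are illustrative assumptions):

```python
import numpy as np

def self_censoring_sample(mu, Sigma, lo, hi, rng):
    """Self-censoring model: coordinate i is observed iff y_i lies in S_i.
    Here each S_i is assumed to be an interval [lo_i, hi_i]; censored
    coordinates take the generic value NaN (standing in for "?")."""
    y = rng.multivariate_normal(mu, Sigma)
    x = y.copy()
    x[(y < lo) | (y > hi)] = np.nan
    return x

def linear_thresholding_sample(mu, Sigma, V, b, rng):
    """Linear thresholding model: S(y) = {i : v_i^T y <= b_i}, where
    row i of V is v_i. Coordinates outside S(y) are censored."""
    y = rng.multivariate_normal(mu, Sigma)
    x = y.copy()
    x[V @ y > b] = np.nan
    return x

rng = np.random.default_rng(0)
d = 3
mu, Sigma = np.zeros(d), np.eye(d)

# Self-censoring: each coordinate is hidden outside [-1, 1].
x1 = self_censoring_sample(mu, Sigma, np.full(d, -1.0), np.full(d, 1.0), rng)

# Linear thresholding: coordinate i is hidden when the sum of all
# coordinates exceeds 0 (every v_i is the all-ones vector, b_i = 0).
x2 = linear_thresholding_sample(mu, Sigma, np.ones((d, d)), np.zeros(d), rng)
print(x1, x2)
```

Note that in the self-censoring model the visibility of coordinate $i$ depends only on $y_i$ itself, whereas under linear thresholding it can depend on all coordinates of $y$ through $v_i^T y$; this is what makes the second setting require the conditional non-rarity assumptions stated above.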
