Keywords: [ Probabilistic Methods ] [ Graphical Models ]

[
Abstract
]

Abstract:

In this paper, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.

(i) For Gaussian graphical models, suppose d-dimensional samples x are generated from a Gaussian N(mu, Sigma) and observed only if they belong to a subset S of R^d. We show that mu and Sigma can be estimated with error epsilon in the Frobenius norm, using O~(nz(Sigma^{-1})/epsilon^2) samples from a truncated N(mu, Sigma) and having access to a membership oracle for S. The set S is assumed to have non-trivial measure under the unknown distribution but is otherwise arbitrary.

(ii) For sparse linear regression, suppose samples (x,y) are generated where y = *. Specifically, we show that under some mild assumptions, only O(k^2 log d) samples are needed to estimate Omega* in the infinity-norm up to a bounded error. Similar results are also estabilished for estimating Omega* in the Euclidean norm up to arbitrary error.

For both problems, our estimator minimizes the sum of the finite population negative log-likelihood function and an ell_1-regularization term.