Direct Loss Minimization for Sparse Gaussian Processes

Yadi Wei · Rishit Sheth · Roni Khardon

Keywords: [ Optimization ] [ Non-Convex Optimization ] [ Algorithms -> Missing Data ] [ Algorithms ] [ Sparsity and Compressed Sensing ] [ Models and Methods ] [ Gaussian Processes ]

[ Abstract ]
Wed 14 Apr 12:45 p.m. PDT — 2:45 p.m. PDT


The paper provides a thorough investigation of Direct Loss Minimization (DLM), which optimizes the posterior to minimize predictive loss, in sparse Gaussian processes. For the conjugate case, we consider DLM for log-loss and DLM for square loss, showing a significant performance improvement in both cases. The application of DLM in non-conjugate cases is more complex because the logarithm of expectation in the log-loss DLM objective is often intractable, and simple sampling leads to biased estimates of gradients. The paper makes two technical contributions to address this. First, a new method using product sampling is proposed, which yields unbiased estimates of the gradients of the objective (uPS). Second, a theoretical analysis of biased Monte Carlo estimates (bMC) shows that stochastic gradient descent converges despite the biased gradients. Experiments demonstrate the empirical success of DLM. A comparison of the sampling methods shows that, while uPS is potentially more sample-efficient, bMC provides a better tradeoff in terms of convergence time and computational efficiency.
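To make the bias issue concrete, the following minimal sketch (not the paper's code) numerically illustrates why simple Monte Carlo gives a biased estimate of the log-loss DLM term log E_{q(f)}[p(y|f)]. It assumes a single data point with a Gaussian variational marginal q(f) = N(m, v) and a Bernoulli-logistic likelihood; the names m, v, y, and the sample counts S are illustrative choices, and Gauss-Hermite quadrature stands in as a ground-truth baseline. By Jensen's inequality, the log of a sample average underestimates the log of the expectation, so the estimator (and hence its gradient) is biased, with the bias shrinking as S grows.

```python
# Minimal sketch (assumed setup, not the paper's code): bias of the plain
# Monte Carlo (bMC) estimate of log E_{q(f)}[p(y|f)] with q(f) = N(m, v)
# and a Bernoulli-logistic likelihood for a single observation y = 1.
import numpy as np

rng = np.random.default_rng(0)
m, v = 0.3, 1.5  # hypothetical variational mean and variance

def log_lik(f):
    # log p(y=1 | f) for a logistic likelihood, computed stably
    return -np.logaddexp(0.0, -f)

# "Ground truth" log E_q[p(y|f)] via 100-point Gauss-Hermite quadrature
nodes, weights = np.polynomial.hermite_e.hermegauss(100)
f_grid = m + np.sqrt(v) * nodes
true_val = np.log(np.sum(weights * np.exp(log_lik(f_grid))) / np.sqrt(2.0 * np.pi))

# bMC estimate: log of an S-sample Monte Carlo average. Its expectation
# lies below the true value (Jensen), so gradients derived from it are
# biased; the bias shrinks as S grows.
for S in (1, 4, 16, 64, 256):
    reps = [np.log(np.mean(np.exp(log_lik(m + np.sqrt(v) * rng.standard_normal(S)))))
            for _ in range(5000)]
    print(f"S={S:4d}  mean bMC estimate = {np.mean(reps):+.4f}  (true {true_val:+.4f})")
```

Running this shows the mean bMC estimate climbing toward the quadrature value as S increases, which is consistent with the paper's analysis that the bias is controlled and stochastic gradient descent still converges when driven by these biased gradients.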
