AISTATS Poster Sharpened Lazy Incremental Quasi-Newton Method

Aakash Sunil Lahoti · Spandan Senapati · Ketan Rajawat · Alec Koppel

MR1 & MR2 - Number 140

Abstract: The problem of minimizing the sum of $n$ functions in $d$ dimensions is ubiquitous in machine learning and statistics. In many applications where the number of observations $n$ is large, it is necessary to use incremental or stochastic methods, as their per-iteration cost is independent of $n$. Of these, Quasi-Newton (QN) methods strike a balance between the per-iteration cost and the convergence rate. Specifically, they exhibit a superlinear rate with $O(d^2)$ cost, in contrast to the linear rate of first-order methods with $O(d)$ cost and the quadratic rate of second-order methods with $O(d^3)$ cost. However, existing incremental methods have notable shortcomings: Incremental Quasi-Newton (IQN) only exhibits asymptotic superlinear convergence. In contrast, Incremental Greedy BFGS (IGS) offers explicit superlinear convergence but suffers from poor empirical performance and has a per-iteration cost of $O(d^3)$. To address these issues, we introduce the Sharpened Lazy Incremental Quasi-Newton Method (SLIQN), which achieves the best of both worlds: an explicit superlinear convergence rate and superior empirical performance at a per-iteration $O(d^2)$ cost. SLIQN features two key changes. First, it incorporates a hybrid strategy of using both classic and greedy BFGS updates, allowing it to empirically outperform both IQN and IGS. Second, it employs a clever constant multiplicative factor along with a lazy propagation strategy, which enables it to have a cost of $O(d^2)$. Additionally, our experiments demonstrate the superiority of SLIQN over other incremental and stochastic Quasi-Newton variants and establish its competitiveness with second-order incremental methods.
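
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of a classic BFGS update and a greedy BFGS update of a Hessian approximation, each costing $O(d^2)$. It is not SLIQN itself: the hybrid schedule, the constant multiplicative factor, and the lazy propagation strategy are not reproduced here. The function names, the quadratic test problem, and the initial approximation are illustrative assumptions, and the greedy rule is written in the form commonly given in the greedy quasi-Newton literature rather than taken from this page.

```python
import numpy as np


def classic_bfgs_update(B, s, y):
    """Classic BFGS update of a Hessian approximation B (hypothetical helper).

    s is the step between iterates and y the matching gradient difference.
    Only matrix-vector and outer products are used, so the cost is O(d^2).
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)


def greedy_bfgs_update(B, A):
    """Greedy BFGS update of B toward a target matrix A (hypothetical helper).

    Assumed form: the direction is the coordinate vector e_i that maximizes
    B[i, i] / A[i, i]; B @ e_i and A @ e_i are then just column i.
    """
    i = int(np.argmax(np.diag(B) / np.diag(A)))
    Bu, Au = B[:, i], A[:, i]
    B_new = B - np.outer(Bu, Bu) / B[i, i] + np.outer(Au, Au) / A[i, i]
    return B_new, i


# Illustrative quadratic with a fixed positive definite Hessian A.
rng = np.random.default_rng(0)
d = 5
M = rng.standard_normal((d, d))
A = M @ M.T + d * np.eye(d)

# Assumed starting point: a scaled identity that upper-bounds A.
B0 = np.trace(A) * np.eye(d)

s = rng.standard_normal(d)
y = A @ s  # gradient difference of a quadratic along the step s

B_classic = classic_bfgs_update(B0, s, y)
B_greedy, i = greedy_bfgs_update(B0, A)

# The classic update enforces the secant equation B s = y; the greedy update
# makes column i of the approximation agree with column i of the target.
print(np.allclose(B_classic @ s, y))
print(np.allclose(B_greedy[:, i], A[:, i]))
```

Both updates are rank-two corrections, which is why they avoid the $O(d^3)$ cost of forming and factoring a full Hessian. They differ only in how the update direction is chosen: the classic update uses the step actually taken, while the greedy update picks the coordinate where the approximation is worst relative to the target.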
