AISTATS Poster DiffRed: Dimensionality reduction guided by stable rank

· Gagan Gupta · Kunal Dutta

MR1 & MR2 - Number 32


Abstract: In this work, we propose a novel dimensionality reduction technique, \textit{DiffRed}, which first projects the data matrix $A$ along its first $k_1$ principal components, and then projects the residual matrix $A^{*}$ (left after subtracting the rank-$k_1$ approximation of $A$) along $k_2$ Gaussian random vectors. We evaluate \emph{M1}, the distortion of the mean-squared pairwise distance, and \emph{Stress}, the normalized RMS distortion of the pairwise distances. We rigorously prove that \textit{DiffRed} achieves a general upper bound of $O\left(\sqrt{\frac{1-p}{k_2}}\right)$ on \emph{Stress} and $O\left(\frac{1-p}{\sqrt{k_2 \rho(A^{*})}}\right)$ on \emph{M1}, where $p$ is the fraction of variance explained by the first $k_1$ principal components and $\rho(A^{*})$ is the \textit{stable rank} of $A^{*}$. These bounds are tighter than the currently known results for random maps. Our extensive experiments on a variety of real-world datasets demonstrate that \textit{DiffRed} achieves near-zero \emph{M1} and much lower \emph{Stress} than well-known dimensionality reduction techniques. In particular, \textit{DiffRed} can map a 6 million dimensional dataset to 10 dimensions with 54\% lower \emph{Stress} than PCA.
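The two-stage projection described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`diffred`, `pairwise_dists`, `m1`, `stress`) are hypothetical, the data is mean-centered before the SVD, the Gaussian vectors are scaled by $1/\sqrt{k_2}$, the stable rank uses the standard definition $\rho(A^{*}) = \|A^{*}\|_F^2 / \|A^{*}\|_2^2$, and the exact formulas for \emph{M1} and \emph{Stress} are one plausible reading of the abstract's verbal descriptions.

```python
import numpy as np


def diffred(A, k1, k2, seed=0):
    """Sketch: embed the rows of A (n x D) into k1 + k2 dimensions.

    Assumes k1 < min(n, D). Returns the embedding, the explained-variance
    fraction p, and the stable rank of the residual A*.
    """
    rng = np.random.default_rng(seed)
    A = A - A.mean(axis=0)  # assumption: center the data before PCA
    _, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Stage 1: coordinates along the first k1 principal components.
    Z1 = A @ Vt[:k1].T

    # Residual after subtracting the rank-k1 approximation of A.
    A_star = A - Z1 @ Vt[:k1]

    # Stage 2: project the residual along k2 Gaussian random vectors.
    # The 1/sqrt(k2) scaling (an assumption) preserves squared norms in expectation.
    G = rng.standard_normal((A.shape[1], k2)) / np.sqrt(k2)
    Z2 = A_star @ G

    # p: fraction of variance explained by the first k1 components.
    p = (s[:k1] ** 2).sum() / (s ** 2).sum()

    # The singular values of A* are s[k1:], so
    # stable rank = ||A*||_F^2 / ||A*||_2^2.
    stable_rank = (s[k1:] ** 2).sum() / s[k1] ** 2

    return np.hstack([Z1, Z2]), p, stable_rank


def pairwise_dists(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    iu = np.triu_indices(X.shape[0], k=1)
    return np.sqrt(np.maximum(d2[iu], 0.0))  # clip tiny negatives from round-off


def m1(X, Z):
    """Assumed form of M1: |1 - sum(d_emb^2) / sum(d_orig^2)|."""
    d, dz = pairwise_dists(X), pairwise_dists(Z)
    return abs(1.0 - (dz ** 2).sum() / (d ** 2).sum())


def stress(X, Z):
    """Assumed form of Stress: sqrt(sum((d_orig - d_emb)^2) / sum(d_orig^2))."""
    d, dz = pairwise_dists(X), pairwise_dists(Z)
    return np.sqrt(((d - dz) ** 2).sum() / (d ** 2).sum())


# Usage on synthetic data with a decaying spectrum (illustrative only).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.standard_normal((60, 300)) * np.linspace(3.0, 0.1, 300)
Z_demo, p_demo, rho_demo = diffred(X_demo, k1=5, k2=20)
print("p =", p_demo, "stable rank =", rho_demo)
print("M1 =", m1(X_demo, Z_demo), "Stress =", stress(X_demo, Z_demo))
```

In this sketch, $k_1$ trades off against $k_2$ for a fixed target dimension: a larger $k_1$ raises $p$ and shrinks the $1-p$ factor in both bounds, while a larger $k_2$ shrinks the $1/\sqrt{k_2}$ factor. A full SVD is used here only for brevity and would not scale to a dataset with millions of dimensions.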
