Principal Component Regression (PCR) is a popular method for
prediction from data, and is one way to address the so-called
multi-collinearity problem in regression. It was recently shown that
spectral algorithms for PCR, such as hard singular value thresholding
(HSVT), are also quite robust, in that they can handle data with
missing or noisy covariates. However, such spectral approaches require
strong distributional assumptions on which entries are observed.
Specifically, every covariate is assumed to be observed with
probability (exactly) $p$, for some value of $p$. Our goal in this work is
to weaken this requirement, and as a step towards this, we study a
``semi-random'' model. In this model, every covariate is revealed
independently with probability $p$, and then an adversary may reveal
additional covariates of its choosing. Although the extra revelations
should intuitively only make the problem easier, it is well known
that algorithms such as HSVT perform poorly in this setting. Our approach is based on studying the closely related problem of Noisy Matrix Completion in a semi-random setting. Our core technical contribution is a new semidefinite programming relaxation, through which we obtain new guarantees for matrix completion.