Consistent Complementary-Label Learning via Order-Preserving Losses
Abstract
In contrast to ordinary supervised classification tasks that require a vast amount of data with high-quality labels, complementary-label learning (CLL) deals with the weakly supervised learning scenario where each instance is equipped with a complementary label, which specifies a class the example does not belong to. However, existing statistically consistent CLL approaches usually suffer intrinsically from overfitting. Although there exist other overfitting-resistant CLL approaches, they either work with only a limited range of losses or lack statistical guarantees. In this paper, we aim to propose overfitting-resistant and theoretically sound approaches for CLL. Exploiting the unique property of the distribution of complementarily labeled samples, we provide a risk estimator based on order-preserving losses, which are naturally non-negative and thus avoid the overfitting caused by negative terms in risk estimators. Moreover, we provide a classifier-consistency analysis and a statistical guarantee for this estimator. Furthermore, we provide a reweighted version of the proposed risk estimator to further enhance its generalization ability and prove its statistical consistency. Experiments on benchmark datasets demonstrate the effectiveness of our proposed methods.