Abstract:
Machine learning models can suffer performance degradation when applied to new tasks due to distribution shifts. Feature representation learning offers a robust solution to this issue. However, a fundamental challenge remains in devising the optimal strategy for feature selection. The existing literature is somewhat paradoxical: some works advocate learning invariant features from source domains, while others favor more diverse features. To better understand this tension, we propose a statistical framework that evaluates the utilities of the features (i.e., how differently the features are used in each source task) based on the variance of their correlations across different domains. Under our framework, we design and analyze a learning procedure consisting of learning content features (comprising both invariant and approximately shared features) from source tasks and fine-tuning them on the target task. Our theoretical analysis highlights the significance of learning approximately shared features—beyond strictly invariant ones—when distribution shifts occur. Our analysis also yields an improved population risk on target tasks compared to previous results. Inspired by our theory, we introduce ProjectionNet, a practical method to distinguish content features from environmental features via \textit{explicit feature space control}, further corroborating our theoretical findings.
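To make the utility notion concrete, below is a minimal Python sketch of one plausible instantiation: scoring each feature by the variance, across source domains, of its correlation with the label. The function name `feature_utility` and the exact statistic are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def feature_utility(features_by_domain, labels_by_domain):
    """Illustrative utility score per feature: the variance, across source
    domains, of its Pearson correlation with the label. Low variance suggests
    a content (invariant or approximately shared) feature; high variance
    suggests an environmental feature. Hypothetical sketch, not the paper's
    exact statistic."""
    per_domain_corr = []
    for X, y in zip(features_by_domain, labels_by_domain):
        Xc = X - X.mean(axis=0)   # center each feature column
        yc = y - y.mean()         # center the labels
        corr = (Xc * yc[:, None]).mean(axis=0) / (Xc.std(axis=0) * yc.std() + 1e-12)
        per_domain_corr.append(corr)
    # Variance of each feature's correlation across domains.
    return np.var(np.stack(per_domain_corr), axis=0)

# Toy check: feature 0 predicts the label identically in every domain
# (content feature); feature 1 flips its sign per domain (environmental).
rng = np.random.default_rng(0)
domains, labels = [], []
for sign in (1.0, -1.0, 1.0):
    X = rng.normal(size=(200, 3))
    y = X[:, 0] + 0.1 * rng.normal(size=200)
    X[:, 1] = sign * y + 0.1 * rng.normal(size=200)
    domains.append(X)
    labels.append(y)

print(feature_utility(domains, labels))  # low for feature 0, high for feature 1
```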