The recent proxy-anchor method has achieved outstanding performance in deep metric learning, which can be attributed to its data-efficient loss based on hard example mining and to its far lower sampling complexity compared with pair-based approaches. In this paper we extend the proxy-anchor method by posing it within the continual learning framework. The motivation is that its loss is an expectation over batches rather than over instances, as is typical in deep learning, and can therefore incur catastrophic forgetting of earlier batches. By regarding each batch as a task in continual learning, we adopt the Bayesian variational continual learning approach to derive a novel loss function. Interestingly, the resulting loss makes two key modifications to the original proxy-anchor loss: i) we inject noise into the proxies when optimizing the proxy-anchor loss, and ii) we encourage momentum updates to avoid abrupt model changes. As a result, the learned model achieves higher test accuracy than proxy-anchor, owing to its robustness to noise in the data (through model perturbation during training) and to reduced batch forgetting. We demonstrate the improved results on several benchmark datasets.
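As a rough illustration of the two modifications, the following minimal sketch evaluates a proxy-anchor loss on Gaussian-perturbed proxies and adds a quadratic penalty that pulls the proxies toward their values after the previous batch. It is not the loss derived in the paper. It assumes a fixed isotropic Gaussian posterior over the proxies, in which case the KL term of variational continual learning reduces to a squared distance between means. The names `perturb`, `regularized_loss`, `sigma`, and `lam`, and the default values of `alpha` and `delta`, are hypothetical choices made for this example.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def proxy_anchor_loss(embeddings, labels, proxies, alpha=32.0, delta=0.1):
    """Proxy-anchor loss for one batch; proxies[c] is the proxy of class c."""
    pos_terms, neg_terms = [], []
    for c, p in enumerate(proxies):
        pos = [cosine(x, p) for x, y in zip(embeddings, labels) if y == c]
        neg = [cosine(x, p) for x, y in zip(embeddings, labels) if y != c]
        if pos:  # only proxies with a positive in the batch enter this term
            pos_terms.append(
                math.log1p(sum(math.exp(-alpha * (s - delta)) for s in pos)))
        neg_terms.append(
            math.log1p(sum(math.exp(alpha * (s + delta)) for s in neg)))
    pos_loss = sum(pos_terms) / max(len(pos_terms), 1)
    neg_loss = sum(neg_terms) / len(proxies)
    return pos_loss + neg_loss


def perturb(proxies, sigma, rng):
    """Assumption: additive isotropic Gaussian noise on every proxy entry."""
    return [[w + sigma * rng.gauss(0.0, 1.0) for w in p] for p in proxies]


def regularized_loss(embeddings, labels, proxies, prev_proxies,
                     sigma=0.05, lam=1.0, rng=None):
    """Noisy-proxy loss plus a penalty on drift from the previous proxies."""
    if rng is None:
        rng = random.Random(0)
    # (i) noise injection: evaluate the loss at perturbed proxies
    data_term = proxy_anchor_loss(
        embeddings, labels, perturb(proxies, sigma, rng))
    # (ii) momentum-like term: discourage abrupt changes between batches
    drift = sum((a - b) ** 2
                for p, q in zip(proxies, prev_proxies)
                for a, b in zip(p, q))
    return data_term + 0.5 * lam * drift
```

In a training loop, `prev_proxies` would hold the proxy values obtained after the preceding batch, so that each batch plays the role of a task. Setting `sigma=0` and `lam=0` recovers the plain proxy-anchor loss in this sketch.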