In the Multiple Instance Learning (MIL) scenario, the training data consists of instances grouped into bags. Bag labels indicate whether each bag contains at least one positive instance, but instance labels are not observed. Recently, Haussmann et al (CVPR 2017) tackled the MIL instance label prediction task by introducing the Multiple Instance Learning Gaussian Process Logistic (MIL-GP-Logistic) model, an adaptation of the Gaussian Process Logistic Classification model that inherits its uncertainty quantification and flexibility. Notably, they provide a fast mean-field variational inference procedure. However, due to their choice of the logistic link, they do not maximize the ELBO objective directly, but rather a lower bound on it. This approximation, as we show, hurts predictive performance. In this work, we propose the Multiple Instance Learning Gaussian Process Probit (MIL-GP-Probit) model, an adaptation of the Gaussian Process Probit Classification model to solve the MIL instance label prediction problem. Leveraging the analytical tractability of the probit link, we give a variational inference procedure based on variable augmentation that maximizes the ELBO objective directly. Applying it, we show MIL-GP-Probit is significantly more calibrated than MIL-GP-Logistic on all 20 datasets of the benchmark 20 Newsgroups dataset collection, and achieves higher AUC than MIL-GP-Logistic on an additional 51 out of 59 datasets. Furthermore, we show how the probit formulation enables principled bag label predictions and a Gibbs sampling scheme. This is the first exact posterior inference procedure for any Bayesian model for the MIL scenario.