Poster

Joint Selection: Adaptively Incorporating Public Information for Private Synthetic Data

Miguel Fuentes · Brett Mullins · Ryan McKenna · Gerome Miklau · Daniel Sheldon

MR1 & MR2 - Number 99
Thu 2 May 8 a.m. PDT — 8:30 a.m. PDT
 
Oral presentation: Oral: Trustworthy ML
Sat 4 May 5 a.m. PDT — 6 a.m. PDT

Abstract:

Mechanisms for generating differentially private synthetic data based on marginals and graphical models have been successful in a wide range of settings. However, one limitation of these methods is their inability to incorporate public data. Initializing a data-generating model by pre-training on public data has been shown to improve the quality of synthetic data, but this technique is not applicable when the model structure is not determined a priori. We develop the mechanism JAM-PGM, which expands the adaptive measurements framework to jointly select between measuring public data and private data. This technique allows public data to be included in a graphical-model-based mechanism. We show that JAM-PGM can outperform both publicly assisted and non-publicly-assisted synthetic data generation mechanisms, even when the public data distribution is biased.
