Model inversion (MI) attacks, which can reconstruct training data from public models, have raised increasing privacy concerns. MI attacks can be formalized as an optimization problem that searches for private data in a certain space. Recent MI attacks leverage a generative adversarial network (GAN) as an image prior to narrow the search space, and can successfully reconstruct even high-dimensional data (e.g., face images). However, these GAN-based MI attacks do not fully exploit the potential capabilities of the target model, which still leaves the search space vague and coupled, i.e., different classes of images are coupled in the search space. Besides, the widely used cross-entropy loss in these attacks suffers from gradient vanishing. To address these problems, we propose the Pseudo Label-Guided MI (PLG-MI) attack via conditional GAN (cGAN). First, a top-\emph{n} selection strategy is proposed to provide pseudo-labels for public data, and these pseudo-labels are used to guide the training of the cGAN. In this way, the search space is decoupled for different classes of images. Then, a max-margin loss is introduced to improve the search process on the subspace of a targeted class. Extensive experiments demonstrate that our PLG-MI attack significantly improves the attack success rate and visual quality for various datasets and models, notably performing $2\sim3 \times$ better than state-of-the-art attacks under strong distributional shifts.
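The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so the code rests on two assumptions: that top-\emph{n} selection keeps, for each class, the n public images to which the target model assigns the highest confidence for that class, and that the max-margin loss takes one common form, the largest non-target logit minus the target logit. The function names `top_n_pseudo_labels` and `max_margin_loss` are hypothetical and are not taken from the paper.

```python
def top_n_pseudo_labels(confidences, n):
    """Assign pseudo-labels to public images (assumed selection rule).

    confidences[i][k] is the target model's confidence that public
    image i belongs to class k. For each class k, keep the indices of
    the n images with the highest confidence for that class.
    """
    num_classes = len(confidences[0])
    selected = {}
    for k in range(num_classes):
        # Rank all public images by their confidence for class k.
        ranked = sorted(
            range(len(confidences)),
            key=lambda i: confidences[i][k],
            reverse=True,
        )
        selected[k] = ranked[:n]
    return selected


def max_margin_loss(logits, target):
    """One common max-margin form (assumed, not confirmed by the abstract).

    Returns the largest non-target logit minus the target logit.
    Minimizing it pushes the target logit above every other logit, and
    its gradient does not shrink as the softmax output saturates.
    """
    best_other = max(v for j, v in enumerate(logits) if j != target)
    return best_other - logits[target]


# Toy data: 4 public images, 2 classes.
confs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
pseudo = top_n_pseudo_labels(confs, n=2)
loss = max_margin_loss([2.0, 0.5, 1.0], target=0)
```

In this sketch the images selected for class k would be labeled k when training the cGAN, so each class gets its own conditional subspace, and the loss would then be minimized over the latent input for a chosen target class.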
National Conference of the American Association for Artificial Intelligence (AAAI)
2023-02-07
2024-11-14