Most existing membership inference attacks (MIAs) utilize metrics (e.g., loss) calculated on the model's final state, while recent advanced attacks leverage metrics computed at various stages, including both intermediate and final stages, throughout the model training. Nevertheless, these attacks often process multiple intermediate states of the metric independently, ignoring their time-dependent patterns. Consequently, they struggle to effectively distinguish between members and non-members who exhibit similar metric values, particularly resulting in a high false-positive rate. In this study, we delve deeper into the new membership signals in the black-box scenario. We identify a new, more integrated membership signal: the Pattern of Metric Sequence, derived from the various stages of model training. We contend that current signals provide only partial perspectives of this new signal: the new one encompasses both the model's multiple intermediate and final states, with a greater emphasis on temporal patterns among them. Building upon this signal, we introduce a novel attack method called Sequential-metric based Membership Inference Attack (SeqMIA). Specifically, we utilize knowledge distillation to obtain a set of distilled models representing various stages of the target model's training. We then assess multiple metrics on these distilled models in chronological order, creating distilled metric sequence. We finally integrate distilled multi-metric sequences as a sequential multiformat and employ an attention-based RNN attack model for inference. Empirical results show SeqMIA outperforms all baselines, especially can achieve an order of magnitude improvement in terms of TPR @ 0.1% FPR. Furthermore, we delve into the reasons why this signal contributes to SeqMIA's high attack performance, and assess various defense mechanisms against SeqMIA.
ACM Conference on Computer and Communications Security (CCS)
2024-10-14
2024-10-08