Existing work on trustworthy machine learning (ML) often focuses on a single aspect of trust (e.g., fairness or privacy) and thus fails to obtain a holistic trust assessment. Furthermore, most techniques fail to recognize that the parties who train models are not the same as those who assess their trustworthiness. To address these limitations, we propose a framework that formulates trustworthy ML as a multi-objective, multi-agent optimization problem. A holistic characterization of trust in ML naturally lends itself to a game-theoretic formulation, which we call regulation games. We introduce and study a particular game instance, the SpecGame, which models the relationship between an ML model builder and regulators seeking to specify and enforce fairness and privacy regulations. Seeking socially optimal (i.e., efficient for all agents) solutions to the game, we introduce ParetoPlay, a novel equilibrium search algorithm that ensures agents remain on the Pareto frontier of their objectives and avoids the inefficiencies of other equilibria. For instance, we show that for a gender classification application, the achieved privacy guarantee is 3.76× worse than the ordained privacy requirement if regulators do not take the initiative to specify their desired guarantees first. We hope that our framework can provide policy guidance.
NeurIPS-Workshop (NeurIPS-W)
2023-10-28
2024-12-02