Inspired by work on Stackelberg security games, we introduce Stackelberg planning, where a leader player in a classical planning task chooses a minimum-cost action sequence aimed at maximizing the plan cost of a follower player in the same task. Stackelberg planning can provide useful analyses not only in planning-based security applications such as network penetration testing, but also for measuring robustness against perturbations in more traditional planning applications (e.g., with a leader sabotaging road network connections in transportation-type domains). To identify all equilibria, which exhibit the trade-off between the leader's own cost and the follower's cost, we design leader-follower search, a state space search at the leader level that calls an optimal planner at the follower level in each state. We devise simple heuristic guidance, branch-and-bound style pruning, and partial-order reduction techniques for this setting. We run experiments on Stackelberg variants of IPC and pentesting benchmarks. In several domains, Stackelberg planning is quite feasible in practice.
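To make the leader-follower search idea concrete, the following is a minimal sketch on a toy road-network task, not the implementation evaluated in the paper. It assumes a follower who needs a cheapest path from a source to a target, and a leader whose only actions are removing edges at a positive cost. All names (`follower_cost`, `leader_follower_search`, `removal_cost`) are hypothetical. Dijkstra's algorithm stands in for the optimal follower planner, uniform-cost search stands in for the leader-level search, and the only pruning shown is stopping once the follower's task has become unsolvable.

```python
import heapq
from itertools import count

INF = float("inf")


def follower_cost(edges, removed, source, target):
    """Optimal follower plan cost: Dijkstra over the edges the leader left intact."""
    adj = {}
    for (u, v), w in edges.items():
        if (u, v) not in removed:
            adj.setdefault(u, []).append((v, w))
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            return d
        if d > dist.get(u, INF):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(queue, (nd, v))
    return INF  # follower task unsolvable


def leader_follower_search(edges, removal_cost, source, target):
    """Return the (leader cost, follower cost) trade-off frontier.

    Leader states are sets of removed edges, expanded in order of
    nondecreasing leader cost (assumes positive removal costs and
    nonnegative edge weights).
    """
    tie = count()  # tie-breaker so the heap never compares frozensets
    start = frozenset()
    best = {start: 0}
    queue = [(0, next(tie), start)]
    frontier = []
    best_follower = -1
    while queue:
        cost, _, removed = heapq.heappop(queue)
        if cost > best[removed]:
            continue
        fc = follower_cost(edges, removed, source, target)
        if fc > best_follower:
            # A strictly better follower cost at the same leader cost
            # dominates the previous frontier entry.
            if frontier and frontier[-1][0] == cost:
                frontier.pop()
            frontier.append((cost, fc))
            best_follower = fc
        if fc == INF:
            break  # no costlier leader plan can improve on this
        for edge, c in removal_cost.items():
            if edge in removed:
                continue
            nxt = removed | {edge}
            ncost = cost + c
            if ncost < best.get(nxt, INF):
                best[nxt] = ncost
                heapq.heappush(queue, (ncost, next(tie), nxt))
    return frontier


# Toy instance (hypothetical data): three routes from "s" to "t".
EDGES = {("s", "a"): 1, ("a", "t"): 1, ("s", "b"): 2, ("b", "t"): 2, ("s", "t"): 5}
REMOVAL_COST = {("s", "a"): 1, ("s", "b"): 2, ("s", "t"): 4}
print(leader_follower_search(EDGES, REMOVAL_COST, "s", "t"))
```

Each pair in the returned list is one point of the trade-off: the cheapest leader plan that achieves a given follower cost. Representing leader states as sets of removed edges means different orderings of the same removals collapse into one state, which is a much cruder device than the partial-order reduction techniques the abstract refers to.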
The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)