Double-Oracle Deep Reinforcement Learning for Handling Exponential Action Space in Sequential Stackelberg Security Games

Date

2022-10-26

Authors

Tan, Czander

Publisher

University of Oregon

Abstract

Standard Stackelberg Security Games (SSGs) assume that attackers are myopic players who select a single target based on the defender's strategy. In this paper, we consider sequential SSGs, in which attackers launch multiple attacks in sequence. Over a sequence of events, however, the defender's action space grows exponentially with the number of time steps, making the problem computationally intractable. To address this issue, this paper makes three contributions. First, we use the Double Oracle algorithm to iteratively derive player strategies. Second, we use Advantage Actor-Critic models to approximate best-response policies for both players. Third, we represent the defender's action space compactly with marginal probabilities instead of enumerating all possible actions. Overall, our experiments show that the Double Oracle algorithm not only searches the space of defender strategies effectively and efficiently, but also provides optimal solutions that outperform other models as the problem scales.
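
The compact representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a simplified setting in which the defender places a fixed number of resources on distinct targets and may reallocate freely at every time step, and it uses the textbook SSG payoff form (a covered reward and an uncovered penalty per target). All function names, variable names, and numbers below are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb


def pure_allocations(num_targets, num_resources):
    """Enumerate every way to place the resources on distinct targets."""
    return list(combinations(range(num_targets), num_resources))


def marginal_coverage(mixed_strategy, num_targets):
    """Collapse a distribution over allocations into per-target coverage probabilities."""
    coverage = [0.0] * num_targets
    for allocation, prob in mixed_strategy.items():
        for target in allocation:
            coverage[target] += prob
    return coverage


def defender_utility(coverage, target, reward_covered, penalty_uncovered):
    """Expected defender payoff if `target` is attacked (assumed textbook SSG form).

    The value depends only on the marginal coverage of the attacked target,
    not on the full distribution over allocations.
    """
    c = coverage[target]
    return c * reward_covered[target] + (1.0 - c) * penalty_uncovered[target]


# Hypothetical toy instance: 4 targets, 2 resources, 3 time steps.
num_targets, num_resources, horizon = 4, 2, 3

allocations = pure_allocations(num_targets, num_resources)
uniform = {a: 1.0 / len(allocations) for a in allocations}
coverage = marginal_coverage(uniform, num_targets)

# Assumption: free reallocation each step, so allocation sequences multiply.
joint_sequences = comb(num_targets, num_resources) ** horizon
# A marginal representation needs one coverage value per target per step.
marginal_parameters = num_targets * horizon

print(coverage, joint_sequences, marginal_parameters)
```

Under these assumptions the number of allocation sequences grows exponentially with the horizon, while the number of marginal coverage values grows only linearly. Whether every marginal vector can be realized by an actual distribution over allocations depends on the scheduling constraints of the game, which this sketch does not model.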
