Double-Oracle Deep Reinforcement Learning for Handling Exponential Action Space in Sequential Stackelberg Security Games
Date
2022-10-26
Authors
Tan, Czander
Publisher
University of Oregon
Abstract
Standard Stackelberg Security Games (SSGs) assume that attackers are myopic players who select only a single target based on the defender's strategy. In this paper, we consider sequential SSGs, in which attackers launch multiple attacks sequentially. Over a sequence of attacks, however, the defender's action space grows exponentially with the number of time steps, making the problem computationally intractable.
To address this issue, this paper makes three contributions. First, we use the Double Oracle algorithm to iteratively derive player strategies. Second, we use Advantage Actor-Critic models to approximate best-response policies for both players. Third, we represent the defender action space compactly with marginal probabilities instead of enumerating all possible actions.
Overall, our experiments show that the Double Oracle algorithm not only lets us search through defender strategies effectively and efficiently, but also provides optimal solutions that outperform other models as the problem size scales.
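
To illustrate the compact representation mentioned in the abstract, the following is a minimal Python sketch and not the thesis's implementation. It assumes a hypothetical setting in which the defender assigns k resources to n targets at each of T time steps, so enumerating every action sequence gives C(n, k)^T possibilities, while a marginal representation needs only one coverage probability per target per step. The function names joint_action_count and sample_allocation are illustrative, and the sampling routine uses comb sampling, a standard technique swapped in here to show that a concrete allocation can be drawn from marginals without enumerating joint actions.

```python
import math
import random


def joint_action_count(num_targets, num_resources, num_steps):
    """Number of defender action sequences if every allocation is enumerated.

    Assumption: at each step the defender picks which `num_resources`
    of the `num_targets` targets to protect.
    """
    return math.comb(num_targets, num_resources) ** num_steps


def sample_allocation(marginals, rng=random):
    """Comb sampling: draw a set of protected targets matching the marginals.

    Assumes each marginal lies in [0, 1] and that they sum to an integer k,
    the number of defender resources. Target i is selected with
    probability marginals[i].
    """
    k = round(sum(marginals))
    next_tooth = rng.random()      # comb teeth sit at u, u + 1, u + 2, ...
    chosen = []
    cumulative = 0.0
    for target, coverage in enumerate(marginals):
        cumulative += coverage
        # Each target owns an interval of length `coverage` <= 1, so it can
        # contain at most one tooth of the comb.
        if next_tooth < cumulative and len(chosen) < k:
            chosen.append(target)
            next_tooth += 1.0
    return chosen


if __name__ == "__main__":
    # Enumerated sequences versus marginal parameters for 10 targets,
    # 3 resources, and 5 time steps.
    print(joint_action_count(10, 3, 5), "joint actions vs", 10 * 5, "marginals")
    print(sample_allocation([0.5, 0.25, 0.75, 0.5]))
```

Under these assumptions the enumerated count grows exponentially in the number of time steps, while the number of marginal probabilities grows only linearly, which is the kind of saving a policy network that outputs marginals could exploit.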