Double-Oracle Deep Reinforcement Learning for Handling Exponential Action Space in Sequential Stackelberg Security Games

Nguyen, Thanh; Tan, Czander
2022-10-26
https://hdl.handle.net/1794/27730

Standard Stackelberg Security Games (SSGs) assume that attackers are myopic players who select only a single target based on the defender's strategy. In this paper, we consider sequential SSGs, in which attackers launch multiple attacks in sequence. With a sequence of events, however, the defender's action space grows exponentially with the number of time steps, making the problem computationally intractable. To handle this issue, this paper makes the following contributions. First, we use the Double Oracle algorithm to iteratively derive player strategies. Second, we use Advantage Actor-Critic models to approximate best-response policies for both players. Lastly, we represent the defender's action space compactly with marginal probabilities instead of enumerating all possible actions. Overall, our experiments show that the Double Oracle algorithm not only allows us to search through defender strategies effectively and efficiently, but also provides optimal solutions that outperform other models in scalable settings.

en-US
All Rights Reserved.
Electronic Thesis or Dissertation
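
The compact representation mentioned in the abstract can be illustrated with a minimal single-step sketch. It assumes a simplified setting that is not taken from the thesis: a defender places `num_resources` identical resources on `num_targets` targets, and a mixed strategy over all such allocations is collapsed into one coverage probability per target. The function names `pure_allocations` and `marginal_coverage` are hypothetical, and the thesis's actual sequential formulation may differ.

```python
from itertools import combinations
from math import comb


def pure_allocations(num_targets, num_resources):
    """Enumerate every way to cover `num_resources` distinct targets."""
    return list(combinations(range(num_targets), num_resources))


def marginal_coverage(mixed_strategy, num_targets):
    """Collapse a distribution over allocations into per-target coverage."""
    coverage = [0.0] * num_targets
    for allocation, prob in mixed_strategy.items():
        for target in allocation:
            coverage[target] += prob
    return coverage


# Assumed toy instance: 4 targets, 2 defender resources.
num_targets, num_resources = 4, 2
allocations = pure_allocations(num_targets, num_resources)

# A uniform mixed strategy over all pure allocations.
uniform = {a: 1.0 / len(allocations) for a in allocations}
coverage = marginal_coverage(uniform, num_targets)

# The explicit strategy needs one probability per allocation,
# while the marginal form needs only one number per target.
print(len(allocations), "allocations vs", len(coverage), "marginals")
print("allocations for 20 targets, 5 resources:", comb(20, 5))
```

In this sketch the marginal vector always has `num_targets` entries that sum to `num_resources`, whereas the number of explicit allocations grows combinatorially with the number of targets.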