Proceedings of the 3rd ACM workshop on Artificial intelligence and security
Proactive assessment of computer-network vulnerability to unknown future attacks is an important but unsolved computer security problem where AI techniques have significant potential for impact. In this paper, we investigate the use of reinforcement learning (RL) for proactive security in the context of denial-of-service (DoS) attacks in peer-to-peer (P2P) networks. Such a tool would help network administrators and designers assess and compare the vulnerability of various network configurations and security measures, so that they can choose among them for maximum security. We first discuss the various dimensions of the problem and how to formulate it as an RL problem. Next, we introduce compact parametric policy representations for both single attackers and botnets and derive a policy-gradient RL algorithm. We evaluate these algorithms under a variety of network configurations that employ recent fair-use DoS security mechanisms. The results show that our RL-based approach significantly outperforms a number of heuristic strategies in terms of the severity of the attacks discovered. The results also suggest some possible network design lessons for reducing the attack potential of an intelligent attacker.