Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation

  • Authors:
  • Jungsuk Song;Hiroki Takakura;Yasuo Okabe;Masashi Eto;Daisuke Inoue;Koji Nakao

  • Affiliations:
  • National Institute of Information and Communications Technology (NICT);Nagoya University;Kyoto University;National Institute of Information and Communications Technology (NICT);National Institute of Information and Communications Technology (NICT);National Institute of Information and Communications Technology (NICT)

  • Venue:
  • Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security
  • Year:
  • 2011

Abstract

With the rapid evolution and proliferation of botnets, large-scale cyber attacks such as DDoS and spam email are becoming increasingly dangerous and serious threats. As a result, network-based security technologies such as Network-based Intrusion Detection Systems (NIDSs), Intrusion Prevention Systems (IPSs), and firewalls have received remarkable attention as a means of defending crucial computer systems, networks, and sensitive information from attackers on the Internet. In particular, much effort has gone into high-performance NIDSs based on data mining and machine learning techniques. However, there is a fatal problem: the existing evaluation dataset, the KDD Cup '99 dataset, cannot reflect current network situations and the latest attack trends, because it was generated by simulation over a virtual network more than 10 years ago. To the best of our knowledge, there is no alternative evaluation dataset. In this paper, we present a new evaluation dataset, called Kyoto 2006+, built on 3 years of real traffic data (Nov. 2006 to Aug. 2009) obtained from diverse types of honeypots. The Kyoto 2006+ dataset will greatly help IDS researchers obtain more practical, useful, and accurate evaluation results. Furthermore, we provide detailed analysis results of the honeypot data and share our experiences so that security researchers can gain insight into the latest cyber attack trends and the state of the Internet.