Redesign and implementation of evaluation dataset for intrusion detection system

Authors:
Jun Qian;Chao Xu;Meilin Shi
Affiliations:
Department of Computer Science, Tsinghua University, Beijing, P.R. China;Department of Computer Science, Tsinghua University, Beijing, P.R. China;Department of Computer Science, Tsinghua University, Beijing, P.R. China
Venue:
ETRICS'06 Proceedings of the 2006 international conference on Emerging Trends in Information and Communication Security
Year:
2006

Abstract

Although the intrusion detection system (IDS) industry is maturing rapidly, the state of IDS evaluation is not. The off-line dataset evaluation proposed by MIT Lincoln Lab is a practical way to evaluate IDS performance. While that evaluation dataset represents a significant undertaking, several issues in the design and modeling of the resulting dataset remain unsolved and may bias the evaluation results. Some researchers have noticed these problems and criticized the design and execution of the dataset, but no technical contribution toward a new dataset has been proposed. In this paper we present our efforts to redesign and generate a new dataset. First, we study how network applications and user behaviors characterize network traffic. Second, we improve the background traffic simulation (including HTTP, SMTP, POP, P2P, FTP and other types of traffic). Unlike the existing model, our model simulates traffic at the user level rather than at the packet level, which is more reasonable for background traffic modeling and simulation. Our model draws on user-level web mining, automatic user profiling, the Enron email dataset, and other sources. Experiments show the high fidelity of the simulated background traffic. Moreover, different kinds of attacker personalities are profiled, and more than 300 instances of 62 different automated attacks are launched against victim hosts and servers. All these efforts aim to make the dataset more “real” and therefore fairer for IDS evaluation.
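
To make the user-level versus packet-level distinction concrete, the following is a minimal sketch of what user-level generation could look like. It is not the authors' implementation. The `UserProfile` class, the `simulate_user` function, the application weights, and the exponentially distributed think times are all hypothetical choices made for illustration. The point is only that the simulator emits application-level actions per user, which a real client would then turn into packets.

```python
import random
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Hypothetical user profile; field names are illustrative only."""
    name: str
    app_weights: dict        # relative frequency of each application, e.g. {"HTTP": 6}
    mean_think_time: float   # assumed mean gap between actions, in seconds


def simulate_user(profile, duration, seed=0):
    """Return a time-ordered list of (time, user, application) actions.

    Assumption: gaps between actions are exponentially distributed.
    The abstract does not state which distribution the paper uses.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    apps = list(profile.app_weights)
    weights = [profile.app_weights[a] for a in apps]
    t = 0.0
    events = []
    while True:
        # Advance simulated time by one think-time gap.
        t += rng.expovariate(1.0 / profile.mean_think_time)
        if t >= duration:
            break
        # Pick which application the user acts in, by profile weight.
        app = rng.choices(apps, weights=weights, k=1)[0]
        events.append((t, profile.name, app))
    return events


if __name__ == "__main__":
    alice = UserProfile("alice", {"HTTP": 6, "SMTP": 2, "POP": 1, "FTP": 1}, 30.0)
    for event in simulate_user(alice, duration=600.0, seed=42):
        print(event)
```

In a sketch like this, each emitted action would drive an actual protocol client (a browser request, a mail send), so packet-level properties emerge from real application behavior rather than being modeled directly.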