Salting public traces with attack traffic to test flow classifiers

Authors:
Z. Berkay Celik;Jayaram Raghuram;George Kesidis;David J. Miller
Affiliations:
Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA;Department of Electrical Engineering, Pennsylvania State University, University Park, PA;Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA and Department of Electrical Engineering, Pennsylvania State University, University Park, PA;Department of Electrical Engineering, Pennsylvania State University, University Park, PA
Venue:
CSET'11 Proceedings of the 4th conference on Cyber security experimentation and test
Year:
2011

Abstract

We consider the problem of using flow-level data to detect botnet command and control (C&C) activity. We find that current approaches do not calibrate the timing of C&C traffic traces before using them to salt a background traffic trace. As a result, timing-based features of the C&C traffic may be artificially distinctive, potentially leading to unrealistically optimistic flow classification results. In this paper, we show that the round-trip times (RTTs) of the C&C traffic are significantly smaller than those of the background traffic. We present a method to calibrate the timing-based features of the simulated botnet traffic by estimating eligible RTT samples from the background traffic. We then salt the background trace with C&C traffic and design flow classifiers under four scenarios: with calibration of the timing-based features of the C&C traffic, without calibration, without using timing-based features at all, and with calibration of the C&C traffic in the test set only. In the flow classifier, we strive to use features that are not readily susceptible to obfuscation or tampering, and therefore avoid features such as port numbers or protocol-specific information in the payload header. We discuss the results for several supervised classifiers, evaluating precision and recall for botnet C&C traffic as well as overall classification accuracy. Our experiments reveal the extent to which timing artifacts in botnet traces change classifier results.
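
To make the calibration idea concrete, the following is a minimal sketch and not the authors' procedure, which the abstract does not spell out. It assumes that the monitor sits near the bot, so that only an outbound-to-inbound turn in a flow contains a round trip. It also assumes that one RTT drawn from the background samples is applied to the whole flow. The function `calibrate_flow` and all of its parameters are hypothetical names introduced for illustration.

```python
import random


def calibrate_flow(timestamps, outbound, lab_rtt, bg_rtt_samples, rng=None):
    """Shift packet timestamps of one C&C flow toward a background RTT.

    Hypothetical helper, not taken from the paper.
    timestamps     -- packet times in seconds, ascending
    outbound       -- True if the packet leaves the bot, False if it arrives
    lab_rtt        -- RTT (seconds) observed in the original C&C capture
    bg_rtt_samples -- RTT samples (seconds) estimated from the background trace
    """
    rng = rng or random.Random()
    # Assumption: one background RTT is drawn per flow.
    target_rtt = rng.choice(bg_rtt_samples)
    # Only lengthen delays; never make the flow faster than it was captured.
    extra = max(0.0, target_rtt - lab_rtt)

    shift = 0.0
    calibrated = []
    for i, t in enumerate(timestamps):
        # An outbound packet followed by an inbound one spans a round trip,
        # so that gap (and everything after it) is pushed back by `extra`.
        if i > 0 and outbound[i - 1] and not outbound[i]:
            shift += extra
        calibrated.append(t + shift)
    return calibrated
```

Timing-based flow features such as duration or inter-arrival statistics would then be recomputed from the shifted timestamps before the flow is salted into the background trace. Drawing one RTT per flow is a simplification chosen for this sketch; per-turn sampling would be an equally plausible variant.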