From throw-away traffic to bots: detecting the rise of DGA-based malware

Authors:
Manos Antonakakis; Roberto Perdisci; Yacin Nadji; Nikolaos Vasiloglou; Saeed Abu-Nimeh; Wenke Lee; David Dagon
Affiliations:
Damballa Inc. and Georgia Institute of Technology; University of Georgia and Georgia Institute of Technology; Georgia Institute of Technology; Damballa Inc.; Damballa Inc.; Georgia Institute of Technology; Georgia Institute of Technology
Venue:
Security'12 Proceedings of the 21st USENIX conference on Security symposium
Year:
2012

Abstract

Many botnet detection systems employ a blacklist of known command and control (C&C) domains to detect bots and block their traffic. Similar to signature-based virus detection, such a botnet detection approach is static because the blacklist is updated only after running an external (and often manual) process of domain discovery. In response, botmasters have begun employing domain generation algorithms (DGAs) to dynamically produce a large number of random domain names and select a small subset for actual C&C use. That is, a C&C domain is randomly generated and used for a very short period of time, thus rendering detection approaches that rely on static domain lists ineffective. Naturally, if we know how a domain generation algorithm works, we can generate the domains ahead of time and still identify and block botnet C&C traffic. Existing solutions are largely based on reverse engineering of the bot malware executables, which is not always feasible. In this paper we present a new technique to detect randomly generated domains without reverse engineering. Our insight is that most of the DGA-generated (random) domains that a bot queries would result in Non-Existent Domain (NXDomain) responses, and that bots from the same botnet (with the same DGA) would generate similar NXDomain traffic. Our approach uses a combination of clustering and classification algorithms. The clustering algorithm clusters domains based on the similarity in the make-up of the domain names as well as the groups of machines that queried these domains. The classification algorithm is used to assign the generated clusters to models of known DGAs. If a cluster cannot be assigned to a known model, then a new model is produced, indicating a new DGA variant or family. We implemented a prototype system and evaluated it on real-world DNS traffic obtained from large ISPs in North America. We report the discovery of twelve DGAs. Half of them are variants of known (botnet) DGAs, and the other half are brand new DGAs that have never been reported before.
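
The clustering idea in the abstract (group NXDomains by the make-up of their names and by the machines that queried them) can be illustrated with a minimal sketch. This is not the authors' algorithm: the features (label length and character entropy), the Jaccard similarity over querying hosts, the greedy grouping, the thresholds, and all function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def char_entropy(label):
    """Shannon entropy (bits per character) of a domain label."""
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def name_features(domain):
    """Crude 'make-up' features of the leftmost label: (length, entropy)."""
    label = domain.split(".")[0]
    return (len(label), char_entropy(label))


def jaccard(a, b):
    """Jaccard similarity of two sets of host identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_nxdomains(observations, host_sim=0.5,
                      max_len_diff=2, max_entropy_diff=0.75):
    """Greedily group NXDomains with similar names and overlapping querying hosts.

    observations: iterable of (host, domain) pairs that drew NXDomain responses.
    The thresholds are illustrative guesses, not values from the paper.
    """
    hosts_by_domain = defaultdict(set)
    for host, domain in observations:
        hosts_by_domain[domain].add(host)

    clusters = []  # each cluster is a list of domains; the first is its representative
    for domain in sorted(hosts_by_domain):
        length, entropy = name_features(domain)
        for cluster in clusters:
            rep = cluster[0]
            rep_length, rep_entropy = name_features(rep)
            if (abs(length - rep_length) <= max_len_diff
                    and abs(entropy - rep_entropy) <= max_entropy_diff
                    and jaccard(hosts_by_domain[domain],
                                hosts_by_domain[rep]) >= host_sim):
                cluster.append(domain)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([domain])
    return clusters


# Hypothetical NXDomain observations: two hosts share random-looking names,
# a third host mistypes a single popular domain.
observations = [
    ("h1", "qxkzvbtrwp.com"), ("h2", "qxkzvbtrwp.com"),
    ("h1", "mzjwqyrkfd.com"), ("h2", "mzjwqyrkfd.com"),
    ("h1", "plvhxgtcnb.com"), ("h2", "plvhxgtcnb.com"),
    ("h3", "gogle.com"),
]
clusters = cluster_nxdomains(observations)
```

In this toy input the three random-looking names queried by the same two hosts end up in one group, while the lone typo domain stays separate. A classification step, as described in the abstract, would then compare each group against models of known DGAs; that step is omitted here.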