A tutorial on hidden Markov models and selected applications in speech recognition
Readings in speech recognition
The Alternating Decision Tree Learning Algorithm
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
X-means: Extending K-means with Efficient Estimation of the Number of Clusters
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Combining Pattern Classifiers: Methods and Algorithms
Combining Pattern Classifiers: Methods and Algorithms
Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data
Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data
BotHunter: detecting malware infection through IDS-driven dialog correlation
SS'07 Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium
Traffic Aggregation for Malware Detection
DIMVA '08 Proceedings of the 5th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
SS'08 Proceedings of the 17th conference on Security symposium
Inside risks: Reflections on Conficker
Communications of the ACM - A View of Parallel Computing
Your botnet is my botnet: analysis of a botnet takeover
Proceedings of the 16th ACM conference on Computer and communications security
Networks: An Introduction
Are Your Hosts Trading or Plotting? Telling P2P File-Sharing and Bots Apart
ICDCS '10 Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems
DNS prefetching and its privacy implications: when good things go bad
LEET'10 Proceedings of the 3rd USENIX conference on Large-scale exploits and emergent threats: botnets, spyware, worms, and more
Detecting algorithmically generated malicious domain names
IMC '10 Proceedings of the 10th ACM SIGCOMM conference on Internet measurement
Building a dynamic reputation system for DNS
USENIX Security'10 Proceedings of the 19th USENIX conference on Security
Malware Analyst's Cookbook and DVD: Tools and Techniques for Fighting Malicious Code
Malware Analyst's Cookbook and DVD: Tools and Techniques for Fighting Malicious Code
Detecting malware domains at the upper DNS hierarchy
SEC'11 Proceedings of the 20th USENIX conference on Security
Detecting stealthy P2P botnets using statistical traffic fingerprints
DSN '11 Proceedings of the 2011 IEEE/IFIP 41st International Conference on Dependable Systems&Networks
An empirical reexamination of global DNS behavior
Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
Understanding the domain registration behavior of spammers
Proceedings of the 2013 conference on Internet measurement conference
Beehive: large-scale log analysis for detecting suspicious activity in enterprise networks
Proceedings of the 29th Annual Computer Security Applications Conference
ExecScent: mining for new C&C domains in live networks with adaptive control protocol templates
SEC'13 Proceedings of the 22nd USENIX conference on Security
Hi-index | 0.00 |
Many botnet detection systems employ a blacklist of known command and control (C&C) domains to detect bots and block their traffic. Similar to signature-based virus detection, such a botnet detection approach is static because the blacklist is updated only after running an external (and often manual) process of domain discovery. As a response, botmasters have begun employing domain generation algorithms (DGAs) to dynamically produce a large number of random domain names and select a small subset for actual C&C use. That is, a C&C domain is randomly generated and used for a very short period of time, thus rendering detection approaches that rely on static domain lists ineffective. Naturally, if we know how a domain generation algorithm works, we can generate the domains ahead of time and still identify and block bot-net C&C traffic. The existing solutions are largely based on reverse engineering of the bot malware executables, which is not always feasible. In this paper we present a new technique to detect randomly generated domains without reversing. Our insight is that most of the DGA-generated (random) domains that a bot queries would result in Non-Existent Domain (NXDomain) responses, and that bots from the same bot-net (with the same DGA algorithm) would generate similar NXDomain traffic. Our approach uses a combination of clustering and classification algorithms. The clustering algorithm clusters domains based on the similarity in the make-ups of domain names as well as the groups of machines that queried these domains. The classification algorithm is used to assign the generated clusters to models of known DGAs. If a cluster cannot be assigned to a known model, then a new model is produced, indicating a new DGA variant or family. We implemented a prototype system and evaluated it on real-world DNS traffic obtained from large ISPs in North America. We report the discovery of twelve DGAs. Half of them are variants of known (botnet) DGAs, and the other half are brand new DGAs that have never been reported before.