We present an automated method for extracting familial signatures for Android malware, i.e., signatures that identify malware produced by piggybacking potentially different benign applications with the same (or similar) malicious code. The APK classes that constitute the malicious code in a repackaged application are separated from the benign code, and the Android API calls used by the malicious modules are extracted to create a signature. A piggybacked malicious app can then be detected by first decomposing it into loosely coupled modules and then matching the Android API calls invoked by each module against the signatures of known malware families. Because the signatures are based on Android API calls, they are tied to the core malware behavior and are thus more resilient to obfuscation. In triage, AV companies need to classify large numbers of samples automatically so as to optimize the assignment of human analysts. They need a system with a low false-negative rate, even at the cost of a higher false-positive rate. With this goal in mind, we fine-tuned our system and verified our algorithm using standard 10-fold cross-validation over a dataset of 1,052 malicious APKs and 48 benign APKs. The results show 94% accuracy, 97% precision, and 93% recall when separating benign apps from malware. We also classified our entire malware dataset into 11 families with 98% accuracy, 87% precision, and 94% recall.
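The matching step described above can be illustrated with a minimal sketch. It assumes that each module is represented as a set of Android API call names and that similarity is measured with a Jaccard score against a fixed threshold. The abstract does not specify the similarity measure, the threshold, or the module decomposition, so the function names, the family names, the example API sets, and the value 0.5 are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

ApiSet = FrozenSet[str]


def jaccard(a: ApiSet, b: ApiSet) -> float:
    """Jaccard similarity of two API-call sets (an assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify_app(
    modules: List[ApiSet],
    signatures: Dict[str, ApiSet],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Optional[Tuple[str, float]]:
    """Return the best (family, score) match over all modules, or None.

    Each module of the app is compared against every family signature;
    the app is flagged if any single module is similar enough.
    """
    best: Optional[Tuple[str, float]] = None
    for module_apis in modules:
        for family, signature in signatures.items():
            score = jaccard(module_apis, signature)
            if score >= threshold and (best is None or score > best[1]):
                best = (family, score)
    return best


# Hypothetical family signatures (illustrative API names only).
EXAMPLE_SIGNATURES: Dict[str, ApiSet] = {
    "FamilyA": frozenset({
        "SmsManager.sendTextMessage",
        "TelephonyManager.getDeviceId",
        "HttpURLConnection.connect",
    }),
    "FamilyB": frozenset({
        "LocationManager.getLastKnownLocation",
        "Runtime.exec",
    }),
}

# A piggybacked app, already decomposed into two loosely coupled modules:
# a benign UI module and an injected module.
EXAMPLE_MODULES: List[ApiSet] = [
    frozenset({"Activity.setContentView", "View.findViewById"}),
    frozenset({
        "SmsManager.sendTextMessage",
        "TelephonyManager.getDeviceId",
        "HttpURLConnection.connect",
        "Log.d",
    }),
]

result = classify_app(EXAMPLE_MODULES, EXAMPLE_SIGNATURES)
```

In this sketch, lowering `threshold` flags more apps, which trades a higher false-positive rate for a lower false-negative rate, in line with the triage goal stated above.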