The rapid proliferation of smartphones over the last few years has come hand in hand with an impressive growth in the number and sophistication of malicious apps targeting smartphone users. The availability of reuse-oriented development methodologies and automated malware production tools makes it exceedingly easy to produce new specimens. As a result, market operators and malware analysts are increasingly overwhelmed by the number of newly discovered samples that must be analyzed. This situation has stimulated research into intelligent instruments that automate parts of the malware analysis process. In this paper, we introduce Dendroid, a system based on text mining and information retrieval techniques for this task. Our approach is motivated by a statistical analysis of the code structures found in a dataset of Android OS malware families, which reveals some parallels with classical problems in those domains. We then adapt the standard Vector Space Model and reformulate the modelling process followed in text mining applications. This enables us to measure similarity between malware samples, which is then used to automatically classify them into families. We also investigate the application of hierarchical clustering over the feature vectors obtained for each malware family. The resulting dendrograms resemble the so-called phylogenetic trees for biological species, allowing us to conjecture about evolutionary relationships among families. Our experimental results suggest that the approach is remarkably accurate and deals efficiently with large databases of malware instances.
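The similarity-based classification step described above can be illustrated with a minimal sketch. The abstract does not state which term weighting or similarity measure Dendroid uses, so this example assumes a smoothed tf-idf weighting and cosine similarity, two common choices for the Vector Space Model. The token names (`cs1`, `cs2`, ...) stand in for code structures and are hypothetical, as are the function names and the nearest-family rule.

```python
import math
from collections import Counter


def idf_weights(families):
    """Inverse document frequency of each token, treating each family as one document.

    Assumption: smoothed idf, so a token shared by every family keeps a small weight.
    """
    n = len(families)
    df = Counter()
    for tokens in families.values():
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def vectorize(tokens, idf):
    """Map a token list to a sparse tf-idf vector; tokens unseen in any family are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(sample_tokens, families):
    """Return the name of the family whose vector is most similar to the sample."""
    idf = idf_weights(families)
    family_vectors = {name: vectorize(toks, idf) for name, toks in families.items()}
    sample_vector = vectorize(sample_tokens, idf)
    return max(family_vectors, key=lambda name: cosine(sample_vector, family_vectors[name]))


if __name__ == "__main__":
    # Hypothetical families, each described by the code structures seen in its samples.
    families = {
        "famA": ["cs1", "cs2", "cs2", "cs3"],
        "famB": ["cs3", "cs4", "cs5"],
    }
    print(classify(["cs1", "cs2"], families))
```

The same per-family vectors could be passed to a hierarchical clustering routine to produce a dendrogram; that step is left out here to keep the sketch short.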