One common vector for malware is JavaScript embedded in Adobe Acrobat (PDF) files. In this paper, we investigate using near-miss clone detectors to find such malware. We start by collecting a set of PDF files containing JavaScript malware and a set with clean JavaScript from the VirusTotal repository. We use the NiCad clone detector to find the clone classes in a small subset of the malicious PDF files. We then evaluate how well these clone classes can be used to find similar malicious files in the rest of the malicious collection while avoiding files in the benign collection. Our results show that a small training set produced 87% detection of previously known malware with 1% false positives.
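The evaluation described above can be illustrated with a minimal sketch. It does not use NiCad: as a stand-in, it compares normalized script lines with Python's standard `difflib.SequenceMatcher`. The function names, the 0.7 similarity threshold, and the idea of treating each training script as a "signature" are assumptions made for illustration, not details taken from the paper.

```python
from difflib import SequenceMatcher


def normalize(script: str) -> list[str]:
    """Strip surrounding whitespace and drop blank lines so layout changes are ignored."""
    return [line.strip() for line in script.splitlines() if line.strip()]


def is_near_miss(a: str, b: str, threshold: float = 0.7) -> bool:
    """True when the line-level similarity of two scripts reaches the threshold.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def flag(script: str, signatures: list[str], threshold: float = 0.7) -> bool:
    """Flag a script if it is a near-miss match of any training signature."""
    return any(is_near_miss(script, sig, threshold) for sig in signatures)


def evaluate(signatures, malicious, benign, threshold=0.7):
    """Return (detection rate, false positive rate) over two labeled collections."""
    detected = sum(flag(s, signatures, threshold) for s in malicious)
    false_pos = sum(flag(s, signatures, threshold) for s in benign)
    detection_rate = detected / len(malicious) if malicious else 0.0
    false_positive_rate = false_pos / len(benign) if benign else 0.0
    return detection_rate, false_positive_rate
```

Here the detection rate is the fraction of held-out malicious scripts matched by at least one signature, and the false positive rate is the fraction of benign scripts matched. A real clone detector such as NiCad works on parsed and pretty-printed code fragments rather than raw lines, so this sketch only conveys the shape of the experiment.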