Measuring similarity of Windows applications using static and dynamic birthmarks

Authors:
Dongjin Kim;Yongman Han;Seong-je Cho;Haeyoung Yoo;Jinwoon Woo;Yunmook Nah;Minkyu Park;Lawrence Chung
Affiliations:
Dankook University, Yongin, Korea;Dankook University, Yongin, Korea;Dankook University, Yongin, Korea;Dankook University, Yongin, Korea;Dankook University, Yongin, Korea;Dankook University, Yongin, Korea;Konkuk University, Chungbuk, Korea;University of Texas at Dallas, Texas
Venue:
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Year:
2013

Abstract

A software birthmark is a set of native characteristics unique to a program, and can therefore be used to measure the similarity between programs. In general, a static software birthmark does not require program execution, but is more vulnerable to semantics-preserving transformations. A dynamic software birthmark is applicable to packed executables, but cannot cover all possible program paths. In this paper, we propose a novel and effective technique for measuring the similarity of Microsoft Windows applications using both static and dynamic birthmarks, which are based on the list of system APIs as well as the frequency of system API calls. Because system APIs are located in Windows system directories and act as a bridge between applications and the operating system, our birthmarks are resilient to obfuscation and compiler optimization. The static birthmark consists of the system API call frequency of a target program, which can be extracted by scanning the executable file. The dynamic birthmark is the frequency of system API function calls, which can be extracted by a binary instrumentation tool while the program executes. To evaluate the effectiveness of the proposed technique, we compare various types of Windows applications using both the static and dynamic birthmarks. To demonstrate its robustness, we compare packed executables that were compressed by a binary packing tool. We also carry out additional experiments that measure the similarity between the target Windows applications at the source code level, and use them to verify the evaluation results. The experimental results show that our birthmarks can effectively measure the similarity between Windows applications.
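To make the idea of a frequency-based birthmark concrete, here is a minimal sketch of how two programs might be compared once their system API calls have been collected. The abstract does not say which similarity measure the authors use, so cosine similarity is an assumption here, and the helper names `birthmark` and `cosine_similarity` as well as the sample call lists are hypothetical.

```python
from collections import Counter
from math import sqrt


def birthmark(api_calls):
    """Build a frequency birthmark: system API name -> number of calls."""
    return Counter(api_calls)


def cosine_similarity(a, b):
    """Cosine similarity of two frequency birthmarks, in the range 0.0 to 1.0.

    Assumption: the paper's actual similarity measure is not given in the
    abstract; cosine similarity is used here only for illustration.
    """
    # Only API names present in both birthmarks contribute to the dot product.
    dot = sum(a[name] * b[name] for name in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty birthmark is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical call lists; in practice they would come from scanning the
# executable (static) or from an instrumentation trace (dynamic).
prog_a = birthmark(["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"])
prog_b = birthmark(["CreateFileW", "ReadFile", "WriteFile", "CloseHandle"])
similarity = cosine_similarity(prog_a, prog_b)
```

The same comparison applies to either kind of birthmark, since both reduce a program to a mapping from system API names to call counts; only the way the counts are collected differs.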