A software birthmark is a set of native characteristics that is unique to a program and can therefore be used to measure the similarity between programs. In general, a static software birthmark does not require program execution but is more vulnerable to semantics-preserving transformations. A dynamic software birthmark is applicable to packed executables but cannot cover all possible program paths. In this paper, we propose a novel technique for measuring the similarity of Microsoft Windows applications using both static and dynamic birthmarks, which are based on the list of system APIs as well as the frequency of system API calls. Because system APIs reside in Windows system directories and act as a bridge between applications and the operating system, our birthmarks are resilient to obfuscation and compiler optimization. The static birthmark consists of the system API call frequency of a target program, which can be extracted by scanning the executable file. The dynamic birthmark is the frequency of system API function calls, which can be extracted by a binary instrumentation tool while the program executes. To evaluate the effectiveness of the proposed technique, we compare various types of Windows applications using both the static and dynamic birthmarks. To demonstrate robustness, we compare packed executables that were compressed with a binary packing tool. We also measure the similarity between the target Windows applications at the source code level and use it to verify the evaluation results. The experimental results show that our birthmarks can effectively measure the similarity between Windows applications.
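To make the idea of a frequency-based birthmark concrete, the following minimal sketch builds an API call frequency table for each program and compares two tables. The abstract does not state which similarity measure is used, so cosine similarity is an assumption made purely for illustration. The helper names (`birthmark`, `cosine_similarity`) and the hard-coded call lists are likewise hypothetical; in practice the lists would come from scanning an executable or from an instrumentation trace.

```python
from collections import Counter
from math import sqrt


def birthmark(api_calls):
    """Build a frequency birthmark: system API name -> number of calls."""
    return Counter(api_calls)


def cosine_similarity(a, b):
    """Cosine similarity of two frequency birthmarks (assumed measure)."""
    # Only API names present in both birthmarks contribute to the dot product.
    dot = sum(a[name] * b[name] for name in a.keys() & b.keys())
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty birthmark is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical call lists standing in for extracted API calls.
original = birthmark(["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"])
suspect = birthmark(["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"])
unrelated = birthmark(["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey"])

print(cosine_similarity(original, suspect))
print(cosine_similarity(original, unrelated))
```

Under this assumed measure, two programs with identical call frequencies score close to 1, while programs that share no system API calls score 0. Any threshold for deciding that one program is a copy of another would have to be chosen empirically.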