Software plagiarism detection via the static API call frequency birthmark

  • Authors:
  • Dong-Kyu Chae; Sang-Wook Kim; Jiwoon Ha; Sang-Chul Lee; Gyun Woo

  • Affiliations:
  • Hanyang University, Korea; Hanyang University, Korea; Hanyang University, Korea; Hanyang University, Korea; National University, Korea

  • Venue:
  • Proceedings of the 28th Annual ACM Symposium on Applied Computing
  • Year:
  • 2013

Abstract

In this paper, we propose a system for detecting software plagiarism using a birthmark. A birthmark is a set of representative features of a program that can be used to identify the program. We use the frequencies of the APIs called in a program as its birthmark. The proposed system consists of three components. First, it extracts the frequency of each API used in a program. Next, it generates the program's birthmark from these API frequencies, assigning weights to the APIs to bring out the program's unique features. Finally, it decides whether plagiarism has occurred based on the cosine similarity between birthmarks. In extensive experiments, the proposed system achieved 97.2% precision and 95.7% recall in plagiarism detection.
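
The three components described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how API calls are extracted, how the weights are computed, or what decision threshold is used, so the IDF-style weighting, the 0.9 threshold, the API names, and all function names below are assumptions made for illustration.

```python
import math
from collections import Counter


def api_frequencies(api_calls):
    """Step 1: count how often each API name occurs in a program's call list."""
    return Counter(api_calls)


def idf_weights(freq_tables):
    """Assumed weighting (not from the paper): APIs that appear in fewer
    programs receive larger weights, in the style of smoothed IDF."""
    n = len(freq_tables)
    doc_freq = Counter()
    for table in freq_tables:
        doc_freq.update(table.keys())
    return {api: math.log((1 + n) / (1 + df)) + 1.0 for api, df in doc_freq.items()}


def birthmark(freqs, weights):
    """Step 2: the birthmark is the weighted API frequency vector."""
    return {api: count * weights.get(api, 1.0) for api, count in freqs.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(api, 0.0) for api, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_plagiarized(a, b, threshold=0.9):
    """Step 3: flag a pair whose similarity reaches a hypothetical threshold."""
    return cosine_similarity(a, b) >= threshold


# Hypothetical API call lists standing in for statically extracted calls.
original = ["File.open", "File.read", "File.read", "Socket.send"]
suspect = ["File.open", "File.read", "File.read", "Socket.send", "String.format"]
unrelated = ["Math.sqrt", "Math.sqrt", "String.format"]

tables = [api_frequencies(p) for p in (original, suspect, unrelated)]
weights = idf_weights(tables)
bm_original, bm_suspect, bm_unrelated = (birthmark(t, weights) for t in tables)

print(cosine_similarity(bm_original, bm_suspect))
print(cosine_similarity(bm_original, bm_unrelated))
```

In this toy data the suspect program shares most of its weighted API profile with the original and is flagged, while the unrelated program shares no APIs with it and is not. A real system would need a threshold tuned on labeled program pairs.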