Software plagiarism detection: a graph-based approach

  • Authors:
  • Dong-Kyu Chae;Jiwoon Ha;Sang-Wook Kim;BooJoong Kang;Eul Gyu Im

  • Affiliations:
  • Hanyang University, Seoul, South Korea;Hanyang University, Seoul, South Korea;Hanyang University, Seoul, South Korea;Hanyang University, Seoul, South Korea;Hanyang University, Seoul, South Korea

  • Venue:
  • Proceedings of the 22nd ACM international conference on information & knowledge management
  • Year:
  • 2013

Abstract

As software plagiarism increases rapidly, there is a growing need for software plagiarism detection systems. In this paper, we propose a software plagiarism detection system based on an API-labeled control flow graph (A-CFG), which abstracts the functionality of a program. The A-CFG reflects both the sequence and the frequency of API calls, whereas previous work rarely considers the two together. To compare a pair of A-CFGs scalably, we use random walk with restart (RWR), which computes an importance score for each node in a graph. With RWR, we generate a single score vector for each A-CFG and compare A-CFGs by comparing their score vectors. Extensive evaluations on a set of Windows applications demonstrate the effectiveness and scalability of the proposed system compared with existing methods.
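
To make the comparison step concrete, here is a minimal sketch of RWR scoring followed by a score-vector comparison. The abstract does not specify the restart probability, the restart distribution, the handling of nodes without successors, or the similarity measure, so everything below is an assumption chosen for illustration: a uniform restart vector, a restart probability of 0.15, cosine similarity, and one node per API label. The function names `rwr_scores` and `cosine_similarity` and the three toy graphs are hypothetical and are not taken from the paper.

```python
import math


def rwr_scores(graph, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart via power iteration.

    graph maps each node (here, an API label) to a list of successor nodes.
    Returns {node: score}; the scores sum to 1.
    """
    nodes = sorted(set(graph) | {v for succs in graph.values() for v in succs})
    n = len(nodes)
    # Assumption: the walker restarts at a node chosen uniformly at random.
    restart_vec = {u: 1.0 / n for u in nodes}
    scores = dict(restart_vec)
    for _ in range(max_iter):
        nxt = {u: restart * restart_vec[u] for u in nodes}
        for u in nodes:
            succs = graph.get(u, [])
            mass = (1.0 - restart) * scores[u]
            if succs:
                # Split the walking mass evenly over outgoing edges.
                share = mass / len(succs)
                for v in succs:
                    nxt[v] += share
            else:
                # Assumption: a node with no successors forces a restart.
                for v in nodes:
                    nxt[v] += mass * restart_vec[v]
        delta = sum(abs(nxt[u] - scores[u]) for u in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


def cosine_similarity(a, b):
    """Cosine similarity of two sparse score vectors keyed by API label."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy graphs (hypothetical): edges follow the order in which APIs are called.
original = {
    "CreateFileW": ["ReadFile"],
    "ReadFile": ["ReadFile", "WriteFile"],  # self-loop models a read loop
    "WriteFile": ["CloseHandle"],
    "CloseHandle": [],
}
near_copy = {
    "CreateFileW": ["ReadFile"],
    "ReadFile": ["ReadFile", "WriteFile"],
    "WriteFile": ["GetLastError", "CloseHandle"],  # one inserted call
    "GetLastError": ["CloseHandle"],
    "CloseHandle": [],
}
unrelated = {
    "socket": ["connect"],
    "connect": ["send"],
    "send": ["recv"],
    "recv": ["closesocket"],
    "closesocket": [],
}

base = rwr_scores(original)
print("near copy:", cosine_similarity(base, rwr_scores(near_copy)))
print("unrelated:", cosine_similarity(base, rwr_scores(unrelated)))
```

Keying each score vector by API label lets graphs of different sizes be compared directly, which is one plausible way to align vectors. It also means that two programs with no APIs in common score exactly zero under this sketch.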