Plagiarism detection among source codes using adaptive local alignment of keywords

  • Authors:
  • Jin-Su Lim;Jeong-Hoon Ji;Hwan-Gue Cho;Gyun Woo

  • Affiliations:
  • Pusan National University, Guemjeong-Gu Busan, Republic of Korea;Korean Intellectual Property Office, Seo-Gu Daejeon, Republic of Korea;Pusan National University, Guemjeong-Gu Busan, Republic of Korea;Pusan National University, Guemjeong-Gu Busan, Republic of Korea

  • Venue:
  • Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication
  • Year:
  • 2011

Abstract

This paper proposes a new method for detecting plagiarized pairs of programs within a large set of source code. The algorithms typically used to detect code plagiarism to date are based either on Greedy-String Tiling or on local alignment of two strings. This paper introduces a variant of local alignment, called adaptive local alignment, which uses an adaptive similarity matrix. Each entry of the adaptive similarity matrix is the logarithm of the probabilities of the keywords, based on their frequencies in a given set of programs. We evaluated the method on a set of programs submitted to more than 10 real programming contests. According to the experimental results, the distribution of the adaptive local alignment is more sensitive than that of previous local alignments that used a fixed similarity matrix (+1 for match, −1 for mismatch, and −2 for gap), and the adaptive local alignment outperforms Greedy-String Tiling in detecting various plagiarism cases.
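The contrast between the fixed and the adaptive similarity matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes that a matching keyword scores the negative base-2 logarithm of its relative frequency in the corpus, and that mismatches and gaps keep the fixed penalties (−1 and −2). The function names `keyword_log_scores`, `local_alignment`, `fixed_score` and `adaptive_score` are hypothetical, and programs are assumed to be already reduced to lists of keyword tokens.

```python
import math
from collections import Counter


def keyword_log_scores(corpus):
    """Assumed adaptive match score: -log2 of each keyword's relative frequency."""
    counts = Counter(tok for program in corpus for tok in program)
    total = sum(counts.values())
    return {tok: -math.log2(n / total) for tok, n in counts.items()}


def local_alignment(a, b, match_score, mismatch=-1.0, gap=-2.0):
    """Best Smith-Waterman local alignment score between token lists a and b."""
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for x in a:
        curr = [0.0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match_score(x) if x == y else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0.0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def fixed_score(a, b):
    """Fixed similarity matrix from the abstract: +1 match, -1 mismatch, -2 gap."""
    return local_alignment(a, b, lambda tok: 1.0)


def adaptive_score(a, b, scores):
    """Adaptive variant: rare keywords contribute more to a match (assumption)."""
    # Keywords missing from the corpus fall back to the fixed +1 score.
    return local_alignment(a, b, lambda tok: scores.get(tok, 1.0))


if __name__ == "__main__":
    p1 = ["for", "if", "=", "return"]
    p2 = ["for", "if", "=", "return"]
    p3 = ["=", "=", "=", "="]
    scores = keyword_log_scores([p1, p2, p3])
    print(fixed_score(p1, p2), adaptive_score(p1, p2, scores))
```

Under these assumptions a shared run of rare keywords raises the score more than a shared run of very common ones (such as assignments), which is one plausible reading of why the adaptive scores would separate plagiarized pairs more sharply than the fixed scheme.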