PDE4Java: Plagiarism Detection Engine for Java source code: a clustering approach

  • Authors:
  • Ameera Jadalla;Ashraf Elnagar

  • Affiliations:
  • Department of Computer Science, College of Arts and Science, University of Sharjah, 27272 Sharjah, UAE;Department of Computer Science, College of Arts and Science, University of Sharjah, 27272 Sharjah, UAE

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2008

Abstract

Educational communities around the world face a growing plagiarism problem. The proposed Plagiarism Detection Engine for Java (PDE4Java) detects code plagiarism by applying data mining techniques. The engine consists of three main phases: Java tokenisation, similarity measurement and clustering. An optional default tokeniser lets the engine be used with almost any programming language. In addition to a textual representation, the system provides a visual representation of each cluster. In simulations, PDE4Java performed comparably to JPlag, and it exceeded expectations when compared with the domain experts' findings.
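
To make the three phases concrete, the following is a minimal sketch of a tokenise, compare and cluster pipeline. It is not PDE4Java's implementation: the abstract does not name the similarity measure or the clustering algorithm, so the Jaccard similarity over token trigrams, the single-link grouping with a fixed threshold, the reduced keyword set, and all names (PlagiarismSketch, tokenise, similarity, cluster) are assumptions chosen for illustration.

```java
import java.util.*;
import java.util.regex.*;

// Illustrative sketch only; every name and design choice here is an assumption.
class PlagiarismSketch {
    // Small keyword subset for illustration, not the full Java keyword list.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "int", "for", "while", "if", "else", "return", "class", "void"));
    // Identifiers/keywords, integer literals, or any single non-space character.
    static final Pattern TOKEN = Pattern.compile("[A-Za-z_]\\w*|\\d+|\\S");

    // Phase 1: turn source text into a normalised token stream so that
    // renaming variables or changing constants does not affect the result.
    static List<String> tokenise(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        while (m.find()) {
            String t = m.group();
            char c = t.charAt(0);
            if (Character.isDigit(c)) out.add("NUM");
            else if (Character.isLetter(c) || c == '_')
                out.add(KEYWORDS.contains(t) ? t : "ID");
            else out.add(t);
        }
        return out;
    }

    // Set of all contiguous token n-grams in a token stream.
    static Set<String> ngrams(List<String> tokens, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= tokens.size(); i++)
            grams.add(String.join(" ", tokens.subList(i, i + n)));
        return grams;
    }

    // Phase 2: Jaccard similarity of token trigram sets (assumed measure).
    static double similarity(String a, String b) {
        Set<String> ga = ngrams(tokenise(a), 3);
        Set<String> gb = ngrams(tokenise(b), 3);
        if (ga.isEmpty() && gb.isEmpty()) return 1.0;
        Set<String> inter = new HashSet<>(ga);
        inter.retainAll(gb);
        Set<String> union = new HashSet<>(ga);
        union.addAll(gb);
        return (double) inter.size() / union.size();
    }

    // Union-find root lookup with path halving.
    static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    // Phase 3: single-link clustering (assumed algorithm). Two submissions
    // share a cluster if a chain of pairs with similarity >= threshold links them.
    static List<List<Integer>> cluster(List<String> sources, double threshold) {
        int n = sources.size();
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (similarity(sources.get(i), sources.get(j)) >= threshold)
                    parent[find(parent, j)] = find(parent, i);
        Map<Integer, List<Integer>> groups = new TreeMap<>();
        for (int i = 0; i < n; i++)
            groups.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(i);
        return new ArrayList<>(groups.values());
    }
}
```

Under these assumptions, calling cluster on a list of submissions with a threshold such as 0.5 returns groups of submission indices. Two submissions that differ only in identifier names and numeric constants normalise to the same token stream and so land in the same cluster. A real engine would likely need a full lexer and a more robust similarity measure.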