Fast plagiarism detection by sentence hashing

  • Authors:
  • Dariusz Ceglarek;Konstanty Haniewicz

  • Affiliations:
  • Poznan School of Banking, Poland;Poznan University of Economics, Poland

  • Venue:
  • ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

This work presents a Sentence Hashing Algorithm for Plagiarism Detection - SHAPD. To present a user with the best results the algorithm makes use of special trait of the written texts - their natural sentence fragmentation, later employing a set of special techniques for text representation. Results obtained demonstrate that the algorithm delivers solution faster than the alternatives. Its algorithmic complexity is logarithmic, thus its performance is better than most algorithms using dynamic programming used to find the longest common subsequence.