Plagiarism in texts is an issue of increasing concern to the academic community. The most common form of text plagiarism today consists of minor alterations such as the insertion, deletion, or substitution of words. Detecting such simple changes, however, requires excessive string comparisons. In this paper, we present a hybrid plagiarism detection method. We investigate the use of a diagonal line, derived from the Levenshtein distance, together with a simplified Smith-Waterman algorithm, a classical tool for identifying and quantifying local similarities in biological sequences, with a view to applying both to plagiarism detection. Our approach avoids globally involved string comparisons and takes psychological factors into account, which yields a significant speed-up in our experiments. Based on these results, we indicate the practicality of this improvement using the Levenshtein distance and the Smith-Waterman algorithm, and we illustrate the efficiency gains. In future work, it would be interesting to explore appropriate heuristics for text comparison.
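For readers unfamiliar with the two measures named in the abstract, the following is a minimal sketch of their textbook forms applied to word tokens. It is not the hybrid method of the paper: the diagonal-line restriction and the simplification of Smith-Waterman are not specified in the abstract and are not reproduced here. The function names and the scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative assumptions.

```python
def levenshtein(a, b):
    """Edit distance between two sequences; each insertion,
    deletion, or substitution costs 1 (textbook formulation)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences.
    The scoring values are assumed for illustration only."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Scores are clamped at 0 so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical sentences with a substitution and an appended word.
    original = "the quick brown fox jumps over the lazy dog".split()
    suspect = "a quick brown fox leaps over the lazy dog today".split()
    print("edit distance:", levenshtein(original, suspect))
    print("local alignment score:", smith_waterman(original, suspect))
```

Both functions fill a full dynamic-programming table row by row, so each comparison costs time proportional to the product of the two sequence lengths. This is the kind of global string comparison the abstract describes as excessive, and it is the cost the proposed hybrid method aims to reduce.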