Plagiarism detection in software using efficient string matching

Authors:
Kusum Lata Pandey; Suneeta Agarwal; Sanjay Misra; Rajesh Prasad
Affiliations:
Ewing Christian College, Allahabad, India; Motilal Nehru National Institute of Technology, Allahabad, India; Atilim University, Ankara, Turkey; Ajay Kumar Garg Engineering College, Ghaziabad, India
Venue:
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part IV
Year:
2012

Abstract

String matching is the problem of finding one or more occurrences of a pattern string within another string or a body of text. It plays a vital role in plagiarism detection in software code, where similar programs must be identified within a large population. String matching has also been used as a tool in software metrics, which measure the quality of the software development process. In recent years, many algorithms have been proposed for the string matching problem. Among them, the Berry-Ravindran algorithm was found to be fairly efficient, and it was further refined in the TVSBS and SSABS algorithms. However, these algorithms do not give the best possible shift in the search phase. In this paper, we propose an algorithm that gives the best possible shift in the search phase and is faster than the previously known algorithms. In the worst case, this algorithm behaves like Berry-Ravindran. The algorithm is further extended to parameterized string matching, which makes it able to detect plagiarism in software code.
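
For orientation, the following is a minimal sketch of the classic Berry-Ravindran search that the abstract uses as its baseline. It is not the algorithm proposed in the paper, whose details are not given here. The function name `br_search` and the dictionary-based shift lookup are illustrative assumptions. The idea shown is that the shift is decided by the two text characters immediately to the right of the current window.

```python
def br_search(text, pattern):
    """Return all start positions of pattern in text.

    Illustrative sketch of the classic Berry-Ravindran shift rule
    (an assumption for exposition, not the paper's proposed algorithm).
    """
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # Rightmost position of each adjacent character pair in the pattern.
    # A pair starting at index i allows a shift of m - i.
    pair_shift = {}
    for i in range(m - 1):
        pair_shift[(pattern[i], pattern[i + 1])] = m - i

    first, last = pattern[0], pattern[-1]
    matches = []
    j = 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            matches.append(j)

        # The two characters just past the window; None acts as a
        # sentinel beyond the end of the text and never matches.
        a = text[j + m] if j + m < n else None
        b = text[j + m + 1] if j + m + 1 < n else None

        if a == last:
            shift = 1                    # a can align with the last pattern char
        elif (a, b) in pair_shift:
            shift = pair_shift[(a, b)]   # align the pair ab inside the pattern
        elif b == first:
            shift = m + 1                # only b can align, with pattern[0]
        else:
            shift = m + 2                # neither character can align
        j += shift
    return matches


print(br_search("abracadabra", "abra"))  # [0, 7]
```

The cases are checked in order of increasing shift, so the first one that applies is also the smallest safe shift, which is what keeps the search from skipping an occurrence.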