Plagiarism detection systems have been developed to locate instances of plagiarism, for example within scientific papers. Studies have shown that existing approaches deliver reasonable results in identifying copy-and-paste plagiarism but fail to detect more sophisticated forms such as paraphrased plagiarism, translation plagiarism, or idea plagiarism. The authors of this paper demonstrated in recent studies that the detection rate can be significantly improved by not relying on text analysis alone but additionally analyzing the citations of a document. Citations are valuable language-independent markers that act much like a fingerprint. In fact, our examinations of real-world cases have shown that the order of citations in a document often remains similar even if the text has been heavily paraphrased or translated to disguise plagiarism. This paper introduces three algorithms and discusses their suitability for citation-based plagiarism detection. Because plagiarism can occur in numerous ways, these algorithms need to be versatile: they must be capable of detecting transpositions, scaling, and combinations of the two, in both local and global form. The algorithms are named Greedy Citation Tiling, Citation Chunking, and Longest Common Citation Sequence. The evaluation showed that, when these algorithms are combined, common forms of plagiarism can be detected reliably.
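To make the idea of comparing citation order concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not define the algorithms, so the sketch assumes that Longest Common Citation Sequence behaves like a classic longest common subsequence over ordered citation lists, and that Greedy Citation Tiling behaves like a simplified greedy tiling that repeatedly claims the longest unmarked run of identical citations. The function names, the `min_len` parameter, and the sample citation keys are all hypothetical. Citation Chunking is omitted because its chunking rules cannot be inferred from the abstract.

```python
from typing import List, Sequence, Tuple


def longest_common_citation_sequence(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Assumed LCCS: longest common subsequence of two citation lists.

    Order must be preserved, but non-matching citations may lie in between.
    """
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover one longest sequence.
    result: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


def greedy_citation_tiles(
    a: Sequence[str], b: Sequence[str], min_len: int = 2
) -> List[Tuple[int, int, int]]:
    """Assumed simplified tiling: returns (start_in_a, start_in_b, length) tiles.

    Each pass finds the longest contiguous run of matching, still unmarked
    citations, marks it, and repeats until no run of min_len remains.
    """
    used_a = [False] * len(a)
    used_b = [False] * len(b)
    tiles: List[Tuple[int, int, int]] = []
    while True:
        best_len, best_i, best_j = 0, 0, 0
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (
                    i + k < len(a)
                    and j + k < len(b)
                    and a[i + k] == b[j + k]
                    and not used_a[i + k]
                    and not used_b[j + k]
                ):
                    k += 1
                if k > best_len:
                    best_len, best_i, best_j = k, i, j
        if best_len < min_len:
            break
        for k in range(best_len):
            used_a[best_i + k] = True
            used_b[best_j + k] = True
        tiles.append((best_i, best_j, best_len))
    return tiles


# Hypothetical citation keys: the second document swaps two blocks of
# citations (a transposition) and inserts one unrelated citation "X".
original = ["A", "B", "C", "D", "E", "F"]
suspicious = ["D", "E", "F", "X", "A", "B", "C"]

print(longest_common_citation_sequence(original, suspicious))
print(greedy_citation_tiles(original, suspicious))
```

In this sketch the subsequence measure can only follow one of the two swapped blocks, because it requires a single global order, while the tiling finds both blocks regardless of their position. That difference is one plausible reading of why the abstract stresses combining the algorithms to cover both local and global patterns.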