In academic courses, students frequently take advantage of someone else’s work to improve their own evaluations or grades. This unethical behavior seriously threatens the integrity of the academic system, and teachers invest substantial effort in preventing and recognizing plagiarism. When students take examinations requiring the production of computer programs, plagiarism detection can be semiautomated using analysis techniques such as JPlag and Moss. These techniques are useful but lose effectiveness when the text of the exam suggests some of the elements that should be structurally part of the solution. The loss occurs because many programs then share common parts as a result of the suggestions in the text of the exam rather than plagiarism. In this article, we present the AuDeNTES anti-plagiarism technique. AuDeNTES detects plagiarism using the code fragments that best represent the individual students’ contributions: it filters out of students’ submissions the parts that might be common to many students due to the suggestions in the text of the exam. The parts to filter are identified by comparing students’ submissions against a reference solution, which is a solution of the exam developed by the teachers. Specifically, AuDeNTES first produces tokenized versions of both the reference solution and the programs that must be analyzed. Then, AuDeNTES removes from the tokenized programs the tokens that are included in the tokenized reference solution. Finally, AuDeNTES computes the similarity among the filtered tokenized programs and produces a ranked list of program pairs suspected of plagiarism. An empirical comparison against multiple state-of-the-art plagiarism detection techniques, using several sets of real students’ programs collected in early programming courses, demonstrated that AuDeNTES identifies more plagiarism cases than the other techniques at the cost of a small additional inspection effort.
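The three steps described in the abstract (tokenize, filter against the reference solution, rank pairs by similarity) can be illustrated with a minimal Python sketch. This is not the AuDeNTES implementation: the abstract does not specify the tokenizer, the granularity of the filtering, or the similarity measure, so the regex tokenizer, the line-level filter, and the Jaccard similarity over token trigrams used here are all assumptions, and every function name is hypothetical.

```python
import re
from itertools import combinations

# Assumption: a crude language-agnostic tokenizer (identifiers, integers,
# single punctuation characters). The real tokenizer is not given in the text.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize_lines(source):
    """Return one tuple of tokens per non-empty source line."""
    lines = []
    for line in source.splitlines():
        tokens = tuple(TOKEN_RE.findall(line))
        if tokens:
            lines.append(tokens)
    return lines


def filter_reference(program_lines, reference_lines):
    """Drop every tokenized line that also occurs in the reference solution.

    Assumption: filtering works on whole tokenized lines; the remaining
    tokens are returned as one flat list.
    """
    reference = set(reference_lines)
    return [tok for line in program_lines if line not in reference for tok in line]


def ngrams(tokens, n=3):
    """Set of contiguous token n-grams."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def similarity(tokens_a, tokens_b, n=3):
    """Jaccard similarity of token n-gram sets (an assumed measure)."""
    grams_a, grams_b = ngrams(tokens_a, n), ngrams(tokens_b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def rank_pairs(submissions, reference_source):
    """Return (score, name_a, name_b) tuples, most similar pair first."""
    reference_lines = tokenize_lines(reference_source)
    filtered = {
        name: filter_reference(tokenize_lines(source), reference_lines)
        for name, source in submissions.items()
    }
    scored = [
        (similarity(filtered[a], filtered[b]), a, b)
        for a, b in combinations(sorted(filtered), 2)
    ]
    return sorted(scored, key=lambda entry: -entry[0])
```

Calling `rank_pairs` with a dictionary that maps student names to source text, plus the teachers' reference solution as a string, yields the ranked list of suspicious pairs. Filtering before comparison means that scaffolding suggested by the exam text (for example a prescribed `main` signature) no longer inflates the similarity of unrelated submissions.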