Sentence-based natural language plagiarism detection

Authors:
Daniel R. White; Mike S. Joy
Affiliations:
University of Warwick, Coventry, United Kingdom; University of Warwick, Coventry, United Kingdom
Venue:
Journal on Educational Resources in Computing (JERIC)
Year:
2004

Abstract

With increasing access to higher education in the United Kingdom, class sizes have grown to the point where tutors cannot realistically be expected to identify peer-to-peer plagiarism by eye, so automated solutions are required. This document details a novel algorithm for comparing suspect documents at the sentence level. The algorithm has been implemented as a component of plagiarism detection software that detects similarities in both natural language documents and comments within program source code. It is capable of detecting sophisticated obfuscation (such as paraphrasing, reordering, merging, and splitting sentences) as well as direct copying. The implementation has been used to successfully detect plagiarism in real assignments at the university, and the software has been evaluated by comparison with other plagiarism detection tools.
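
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a generic sentence-level comparison might look like. It is not the authors' method. It assumes a naive punctuation-based sentence splitter, a small illustrative stop-word list, a word-set containment score, and a threshold of 0.8; all function names and parameter values are hypothetical choices made for illustration.

```python
import re
from itertools import product

# Small illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "it",
              "on", "for", "as", "by", "with"}


def split_sentences(text):
    """Naively split text into sentences on '.', '!' or '?'."""
    parts = re.split(r"[.!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def content_words(sentence):
    """Lower-case a sentence and return its set of non-stop-words."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return {w for w in words if w not in STOP_WORDS}


def containment(words_a, words_b):
    """Shared words divided by the size of the smaller set (0.0 if either is empty)."""
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))


def similar_sentences(doc_a, doc_b, threshold=0.8):
    """Return (sentence_a, sentence_b, score) for pairs scoring at or above threshold."""
    matches = []
    for sa, sb in product(split_sentences(doc_a), split_sentences(doc_b)):
        score = containment(content_words(sa), content_words(sb))
        if score >= threshold:
            matches.append((sa, sb, score))
    return matches


if __name__ == "__main__":
    doc_a = "The cat sat on the mat. Dogs bark loudly at night."
    doc_b = "On the mat the cat sat. Birds sing in the morning."
    for sa, sb, score in similar_sentences(doc_a, doc_b):
        print(f"{score:.2f}: {sa!r} <-> {sb!r}")
```

Comparing unordered word sets makes the score insensitive to word reordering within a sentence, and dividing by the smaller set gives a short sentence a high score against a longer sentence that contains it, which loosely approximates the merged and split cases. Paraphrasing with substituted vocabulary would need something beyond this sketch.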