On Automatic Plagiarism Detection Based on n-Grams Comparison

Authors:
Alberto Barrón-Cedeño;Paolo Rosso
Affiliations:
Natural Language Engineering Lab. Dpto. Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Spain;Natural Language Engineering Lab. Dpto. Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Spain
Venue:
ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
Year:
2009

Abstract

When automatic plagiarism detection is carried out against a reference corpus, a suspicious text is compared to a set of original documents in order to relate the plagiarised text fragments to their potential sources. One of the biggest difficulties in this task is locating plagiarised fragments that have been modified with respect to the source text (by rewording, insertion or deletion, for example). The definition of proper text chunks as comparison units for the suspicious and original texts is crucial for the success of this kind of application. Our experiments with the METER corpus show that the best results are obtained when considering low-level word n-gram comparisons (n ∈ {2, 3}).
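
As a rough illustration of what comparing a suspicious fragment to reference documents through word n-grams can look like, the sketch below scores a fragment by the fraction of its n-grams that also occur in a candidate source. The abstract does not state which similarity measure or preprocessing the authors used, so the containment-style score, the lowercased `\w+` tokenisation, and the names `word_ngrams`, `containment` and `best_source` are all assumptions made for this example.

```python
import re


def word_ngrams(text, n):
    """Return the set of word n-grams (as tuples) found in text.

    Tokenisation (lowercasing, \\w+ tokens) is an assumption of this sketch.
    """
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(fragment, source, n=2):
    """Fraction of the fragment's n-grams that also occur in source.

    Hypothetical measure chosen for illustration; returns 0.0 when the
    fragment is too short to contain any n-gram.
    """
    frag_grams = word_ngrams(fragment, n)
    if not frag_grams:
        return 0.0
    return len(frag_grams & word_ngrams(source, n)) / len(frag_grams)


def best_source(fragment, reference_corpus, n=2):
    """Return (doc_id, score) for the reference document that shares the
    largest fraction of the fragment's n-grams.

    reference_corpus is assumed to be a non-empty dict of doc_id -> text.
    """
    return max(
        ((doc_id, containment(fragment, text, n))
         for doc_id, text in reference_corpus.items()),
        key=lambda pair: pair[1],
    )
```

With low values of n such as 2 or 3, a fragment that has been reworded or had a word inserted still shares most of its n-grams with the source, whereas longer n-grams are broken by every local change. That is one plausible reading of why short n-grams perform well, not a claim taken from the paper.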