Existing research on text plagiarism detection focuses mainly on external plagiarism detection, which assumes that a reference collection is given and compares suspicious documents against it to find plagiarized articles with high similarity. Existing methods perform well at identifying externally plagiarized sections. In real-world settings, however, a reference collection is often impossible to obtain. This paper addresses that case and proposes an intrinsic plagiarism detection framework based on a supervised machine learning approach. The instance creation and feature selection methods are presented in detail. Experimental results on the PAN'09 corpus demonstrate the effectiveness of our approach to intrinsic plagiarism detection.
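To make the idea of supervised intrinsic detection concrete, the following is a minimal sketch, not the method of the paper. The abstract does not list the features, window sizes, or classifier used, so everything here is an assumption for illustration: the function-word list, the three kinds of stylometric features, the sliding-window parameters, and the small logistic regression trainer are all hypothetical. The sketch shows one plausible form of instance creation: each text window becomes an instance whose features measure how far its style deviates from the style of the whole document, so no reference collection is needed.

```python
import math
import re
from collections import Counter

# Hypothetical feature set; the source does not specify its features.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "is", "that")


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def style_vector(tokens):
    """Style profile: mean word length, type-token ratio, function-word rates."""
    if not tokens:
        return [0.0] * (2 + len(FUNCTION_WORDS))
    counts = Counter(tokens)
    n = len(tokens)
    mean_len = sum(len(t) for t in tokens) / n
    type_token_ratio = len(counts) / n
    return [mean_len, type_token_ratio] + [counts[w] / n for w in FUNCTION_WORDS]


def make_instances(text, window=50, step=25):
    """Instance creation: one instance per sliding window of tokens.

    Each feature is the absolute deviation of the window's style from the
    style of the whole document (assumed window and step sizes).
    """
    tokens = tokenize(text)
    doc_style = style_vector(tokens)
    instances = []
    for start in range(0, max(len(tokens) - window, 0) + 1, step):
        win_style = style_vector(tokens[start:start + window])
        features = [abs(w - d) for w, d in zip(win_style, doc_style)]
        instances.append((start, features))
    return instances


def _sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Logistic regression by stochastic gradient descent (labels 0 or 1)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = _sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            g = p - label  # gradient of the log loss w.r.t. the score
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict_proba(model, x):
    """Estimated probability that an instance is plagiarized (label 1)."""
    w, b = model
    return _sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
```

In such a setup, the windows of annotated training documents would be labeled 1 where they overlap a marked plagiarized passage and 0 elsewhere, the feature vectors from `make_instances` would be passed to `train_logistic`, and windows of an unseen document would be scored with `predict_proba`. Any standard classifier could replace the hand-written trainer.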