Multilingual Plagiarism Detection

Authors:
Zdenek Ceska;Michal Toman;Karel Jezek
Affiliations:
Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic 306 14;Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic 306 14;Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic 306 14
Venue:
AIMSA '08 Proceedings of the 13th international conference on Artificial Intelligence: Methodology, Systems, and Applications
Year:
2008

Abstract

Multilingual text processing has gained increasing attention in recent years, a trend accentuated by the global integration of European states and by vanishing cultural and social boundaries. It has become an important field that raises many new and interesting problems. This paper describes a novel approach to multilingual plagiarism detection. We propose a new method, called MLPlag, for plagiarism detection in a multilingual environment. The method is based on the analysis of word positions. It uses the EuroWordNet thesaurus to transform words into a language-independent form, which makes it possible to identify documents plagiarized from sources written in other languages. Special techniques, such as semantic-based word normalization, were incorporated to refine the method. The method identifies synonym replacements that plagiarists use to hide a document match. We performed and evaluated experiments on monolingual and multilingual corpora, and the results are presented in this paper.
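
The idea of mapping words to a language-independent form before comparing documents can be illustrated with a minimal sketch. This is not the MLPlag implementation: the abstract gives no algorithmic details, so the sketch assumes a tiny hand-made dictionary (`TOY_THESAURUS`, with invented concept ids) in place of EuroWordNet, and it substitutes a plain Jaccard overlap of concept bigrams for the paper's word-position analysis. All function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy thesaurus: (language, word) -> language-independent concept id.
# It stands in for EuroWordNet; the entries and ids are invented for illustration.
TOY_THESAURUS: Dict[Tuple[str, str], str] = {
    ("en", "car"): "C1", ("en", "automobile"): "C1", ("cs", "auto"): "C1",
    ("en", "fast"): "C2", ("en", "quick"): "C2", ("cs", "rychlý"): "C2",
    ("en", "road"): "C3", ("cs", "silnice"): "C3",
    ("en", "dog"): "C4", ("cs", "pes"): "C4",
}


def to_concepts(text: str, lang: str) -> List[str]:
    """Map each known word to its concept id, preserving word order.

    Synonyms share one id, so replacing a word with a synonym does not
    change the output. Unknown words are skipped.
    """
    words = text.lower().split()
    return [TOY_THESAURUS[(lang, w)] for w in words if (lang, w) in TOY_THESAURUS]


def concept_ngrams(concepts: List[str], n: int = 2) -> Set[Tuple[str, ...]]:
    """Collect the set of consecutive concept n-grams."""
    return {tuple(concepts[i:i + n]) for i in range(len(concepts) - n + 1)}


def similarity(a: str, lang_a: str, b: str, lang_b: str, n: int = 2) -> float:
    """Jaccard overlap of concept n-grams (an assumed stand-in measure)."""
    grams_a = concept_ngrams(to_concepts(a, lang_a), n)
    grams_b = concept_ngrams(to_concepts(b, lang_b), n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # English text with synonyms vs. a Czech text with the same concept order.
    print(similarity("quick automobile road", "en", "rychlý auto silnice", "cs"))
    # Unrelated texts share no concept bigrams.
    print(similarity("dog road", "en", "rychlý auto", "cs"))
```

Because "quick" and "fast", or "automobile" and "car", collapse to the same concept id, a synonym swap leaves the concept sequence unchanged, and the Czech sentence yields the same sequence as the English one. A real system would also need lemmatization and word-sense disambiguation, which this sketch omits.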