Multilingual text processing has attracted growing attention in recent years, a trend accentuated by the global integration of European states and the fading of cultural and social boundaries. It has become an important field that raises many new and interesting problems. This paper describes a novel approach to multilingual plagiarism detection. We propose a new method, MLPlag, for detecting plagiarism in multilingual environments. The method is based on the analysis of word positions. It uses the EuroWordNet thesaurus to transform words into a language-independent form, which makes it possible to identify documents plagiarized from sources written in other languages. Special techniques, such as semantic-based word normalization, were incorporated to refine the method, so that it identifies the synonym replacements plagiarists use to hide a document match. We evaluated the method experimentally on monolingual and multilingual corpora, and the results are presented in this paper.
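The idea of mapping words to a language-independent form and then comparing word positions can be illustrated with a minimal sketch. This is not the MLPlag implementation: the abstract does not give its scoring details, so the dictionary `TOY_INDEX` (a hand-made stand-in for a EuroWordNet-style index), the concept ids, the functions `to_concepts` and `positional_overlap`, and the `window` parameter are all assumptions made for illustration only.

```python
# Hypothetical stand-in for a EuroWordNet-style index:
# (language, word) -> language-independent concept id.
# Every entry and id below is invented for this sketch.
TOY_INDEX = {
    ("en", "car"): "c1", ("en", "automobile"): "c1", ("es", "coche"): "c1",
    ("en", "red"): "c3", ("es", "rojo"): "c3",
    ("en", "runs"): "c4", ("es", "corre"): "c4",
    ("en", "dog"): "c5", ("en", "sleeps"): "c6",
}


def to_concepts(text, lang):
    """Map a text to a sequence of concept ids, dropping unknown words.

    Synonyms share a concept id, so replacing "car" with "automobile"
    does not change the resulting sequence.
    """
    concepts = []
    for word in text.lower().split():
        concept = TOY_INDEX.get((lang, word))
        if concept is not None:
            concepts.append(concept)
    return concepts


def positional_overlap(a, b, window=1):
    """Fraction of concepts in `a` found in `b` at a nearby position.

    A concept at index i in `a` counts as matched if it occurs in `b`
    between indices i - window and i + window (inclusive). This is an
    assumed scoring rule, chosen only to show position-aware matching.
    """
    if not a:
        return 0.0
    matched = 0
    for i, concept in enumerate(a):
        start = max(0, i - window)
        if concept in b[start:i + window + 1]:
            matched += 1
    return matched / len(a)


suspicious = to_concepts("The red automobile runs", "en")
source = to_concepts("El coche rojo corre", "es")
unrelated = to_concepts("The dog sleeps", "en")

score_translated = positional_overlap(suspicious, source)
score_unrelated = positional_overlap(unrelated, source)
```

In this toy setting the English sentence and its Spanish counterpart reduce to the same concepts in slightly different order, so a small position window still matches them, while the unrelated sentence shares no concepts with the source. A real system would need a full thesaurus, lemmatization, and word-sense handling, none of which is shown here.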