A detecting and tracing algorithm for unauthorized internet-news plagiarism using spatio-temporal document evolution model

Authors:
Chang-Keon Ryu;Hyong-Jun Kim;Hwan-Gue Cho
Affiliations:
Pusan National University, Republic of Korea;Pusan National University, Republic of Korea;Pusan National University, Republic of Korea
Venue:
Proceedings of the 2009 ACM symposium on Applied Computing
Year:
2009

Abstract

Plagiarism among digital documents is one of the serious problems that has accompanied the growth of the Internet. It is not difficult to find news articles that have been copied without authorization. Newspapers around the world often cover the same event, and each wants to publish its article as soon as possible. As a result, some Internet-based news media plagiarize the articles of other companies rather than producing their own by sending a reporter or relying on syndication. This kind of plagiarism inflates the number of Internet news articles, many of which are nearly identical because the text has been copied and pasted. Such plagiarized articles also make it harder for Internet search tools to collect the original articles. For example, it is known that more than 10-20% of the articles collected by portal sites are nearly identical or quite similar. To deal with this problem, we need a strict and stable tool for detecting Internet news plagiarism. It is also highly desirable, where possible, to trace back and reconstruct the history of plagiarism. In this paper, we propose a new detection algorithm for unauthorized news article plagiarism that uses a spatio-temporal evolution model. Using this model, we can reveal the sequence of plagiarism among news articles.
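
The abstract does not spell out the spatio-temporal evolution model, so the following is only a rough illustration of the general idea of detecting near-duplicate articles and then ordering them in time to suggest a copying sequence. It is not the authors' algorithm. The word-shingle representation, the Jaccard similarity measure, the 0.5 threshold, and the names shingles, jaccard, and trace_sources are all assumptions made for this sketch.

```python
from datetime import datetime


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def trace_sources(articles, threshold=0.5, k=3):
    """Map each article id to its most similar earlier article, or None.

    articles: iterable of (article_id, published_at, text).
    Assumption for this sketch: an article can only copy from one that
    was published before it, and the threshold value is arbitrary.
    """
    ordered = sorted(articles, key=lambda a: a[1])  # oldest first
    seen = []    # (article_id, shingle set) of earlier articles
    parent = {}
    for art_id, _, text in ordered:
        sh = shingles(text, k)
        best_id, best_sim = None, 0.0
        for prev_id, prev_sh in seen:
            sim = jaccard(sh, prev_sh)
            if sim >= threshold and sim > best_sim:
                best_id, best_sim = prev_id, sim
        parent[art_id] = best_id
        seen.append((art_id, sh))
    return parent


# Hypothetical toy data, invented for illustration only.
articles = [
    ("a", datetime(2009, 1, 1, 9),
     "the city council approved the new budget for public transport on monday"),
    ("b", datetime(2009, 1, 1, 10),
     "the city council approved the new budget for public transport on monday evening"),
    ("c", datetime(2009, 1, 1, 11),
     "heavy rain flooded several streets in the southern district overnight"),
]
result = trace_sources(articles)
```

In this toy data, article "b" is linked back to the earlier article "a", while "a" and "c" have no probable source. A real system would need a more robust similarity measure and a way to handle articles that draw legitimately on the same syndicated feed.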