A detecting and tracing algorithm for unauthorized internet-news plagiarism using spatio-temporal document evolution model

  • Authors:
  • Chang-Keon Ryu; Hyong-Jun Kim; Hwan-Gue Cho

  • Affiliations:
  • Pusan National University, Republic of Korea; Pusan National University, Republic of Korea; Pusan National University, Republic of Korea

  • Venue:
  • Proceedings of the 2009 ACM symposium on Applied Computing
  • Year:
  • 2009

Abstract

Widespread plagiarism of digital documents is one of the serious problems that have accompanied the growth of the Internet. It is not difficult to find news articles that have been copied without authorization. Newspapers around the world often cover the same event, and each wants to publish its article as soon as possible. As a result, a few Internet-based news outlets plagiarize other companies' articles rather than producing their own by sending a reporter or relying on syndication. This kind of plagiarism inflates the number of Internet news articles, many of which are nearly identical because their text was copied and pasted. Such plagiarized articles also prevent Internet search tools from collecting the original articles. For example, it is known that more than 10-20% of the articles collected by portal sites are nearly identical or quite similar. To deal with this problem, we need a strict and stable tool for detecting Internet news plagiarism. It is also highly desirable, where possible, to trace back and reconstruct the history of the plagiarism. In this paper, we propose a new detection algorithm for unauthorized news article plagiarism that uses a spatio-temporal evolution model. Using this model, we can reveal the sequence of plagiarism among news articles.
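
The abstract does not describe the model in detail, so the following is only a rough illustration of the general idea of detecting near-identical articles and ordering them in time, not the authors' algorithm. It assumes that textual closeness can be approximated by Jaccard similarity over word shingles and that the temporal order comes from publication timestamps. The names `shingles`, `jaccard`, and `trace_sources`, the shingle size, and the 0.5 threshold are all hypothetical choices made for this sketch.

```python
from datetime import datetime


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def trace_sources(articles, threshold=0.5, k=3):
    """Map each article id to its most similar earlier article, or None.

    articles: list of (article_id, published_at, text) tuples.
    Assumption: an article can only have copied from one published earlier.
    """
    ordered = sorted(articles, key=lambda a: a[1])
    sets = {aid: shingles(text, k) for aid, _, text in ordered}
    parent = {}
    for i, (aid, _, _) in enumerate(ordered):
        best_id, best_sim = None, 0.0
        for earlier_id, _, _ in ordered[:i]:
            sim = jaccard(sets[aid], sets[earlier_id])
            if sim >= threshold and sim > best_sim:
                best_id, best_sim = earlier_id, sim
        parent[aid] = best_id
    return parent


# Made-up example data, for illustration only.
articles = [
    ("a", datetime(2009, 1, 1, 9, 0),
     "the mayor opened the new bridge over the river on monday morning"),
    ("b", datetime(2009, 1, 1, 10, 0),
     "the mayor opened the new bridge over the river on monday morning officials said"),
    ("c", datetime(2009, 1, 1, 11, 0),
     "stock markets fell sharply after the central bank raised interest rates"),
]
print(trace_sources(articles))
```

In this toy data, article "b" shares most of its shingles with the earlier article "a", so it is linked to "a", while "a" and "c" have no candidate source. Following the resulting links from article to article gives one possible reconstruction of a copying sequence under these assumptions.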