A co-classification framework for detecting web spam and spammers in social media web sites

  • Authors:
  • Feilong Chen;Pang-Ning Tan;Anil K. Jain

  • Affiliations:
  • Michigan State University, East Lansing, MI, USA;Michigan State University, East Lansing, MI, USA;Michigan State University, East Lansing, MI, USA

  • Venue:
  • Proceedings of the 18th ACM conference on Information and knowledge management
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found between 7% to 18% of their URLs are posted on two popular social media Web sites, digg.com and delicious.com. In this paper, we present a co-classification framework to detect Web spam and the spammers who are responsible for posting them on the social media Web sites. The rationale for our approach is that since both detection tasks are related, it would be advantageous to train them simultaneously to make use of the labeled examples in the Web spam and spammer training data. We have evaluated the effectiveness of our algorithm on the delicious.com data set. Our experimental results showed that the proposed co-classification algorithm significantly outperforms classifiers that learn each detection task independently.