A co-classification framework for detecting web spam and spammers in social media web sites

Authors:
Feilong Chen; Pang-Ning Tan; Anil K. Jain
Affiliations:
Michigan State University, East Lansing, MI, USA;Michigan State University, East Lansing, MI, USA;Michigan State University, East Lansing, MI, USA
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Abstract

Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found that between 7% and 18% of their URLs are posted on two popular social media Web sites, digg.com and delicious.com. In this paper, we present a co-classification framework to detect Web spam and the spammers who are responsible for posting it on social media Web sites. The rationale for our approach is that, since the two detection tasks are related, it is advantageous to train their classifiers simultaneously so that each can make use of the labeled examples in both the Web spam and the spammer training data. We have evaluated the effectiveness of our algorithm on the delicious.com data set. Our experimental results show that the proposed co-classification algorithm significantly outperforms classifiers that learn each detection task independently.