Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found that between 7% and 18% of their URLs are posted on two popular social media Web sites, digg.com and delicious.com. In this paper, we present a co-classification framework to detect Web spam and the spammers who are responsible for posting it on social media Web sites. The rationale for our approach is that, since the two detection tasks are related, it is advantageous to train their classifiers simultaneously so that each can make use of the labeled examples in both the Web spam and the spammer training data. We evaluated the effectiveness of our algorithm on the delicious.com data set. Our experimental results show that the proposed co-classification algorithm significantly outperforms classifiers that learn each detection task independently.