Detecting spammers and content promoters in online video social networks

Authors:
Fabrício Benevenuto;Tiago Rodrigues;Virgílio Almeida;Jussara Almeida;Marcos Gonçalves
Affiliations:
Federal University of Minas Gerais, Belo Horizonte, Brazil;Federal University of Minas Gerais, Belo Horizonte, Brazil;Federal University of Minas Gerais, Belo Horizonte, Brazil;Federal University of Minas Gerais, Belo Horizonte, Brazil;Federal University of Minas Gerais, Belo Horizonte, Brazil
Venue:
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Year:
2009

Abstract

A number of online video social networks, of which YouTube is the most popular, provide features that allow users to post a video as a response to a discussion topic. These features open opportunities for users to introduce polluted content, or simply pollution, into the system. For instance, spammers may post an unrelated video as a response to a popular one, aiming to increase the likelihood that the response is viewed by a larger number of users. Moreover, opportunistic users (promoters) may try to gain visibility for a specific video by posting a large number of (potentially unrelated) responses to boost the rank of the responded video, making it appear in the top lists maintained by the system. Content pollution may jeopardize the trust of users in the system, thus compromising its success in promoting social interactions. Despite this, the available literature offers only a limited understanding of the problem. In this paper, we go a step further by addressing the issue of detecting video spammers and promoters. To that end, we manually build a test collection of real YouTube users, classifying them as spammers, promoters, or legitimate users. Using our test collection, we characterize social and content attributes that may help distinguish each user class. We also investigate the feasibility of using a state-of-the-art supervised classification algorithm to detect spammers and promoters, and assess its effectiveness on our test collection. We found that our approach correctly identifies the majority of the promoters while misclassifying only a small percentage of legitimate users. In contrast, although we are able to detect a significant fraction of spammers, they proved much harder to distinguish from legitimate users.