Twitter users are increasingly harassed by unsolicited messages that waste their time and lure them into clicking malicious links. Increasingly powerful methods have been designed to detect spam, but many depend on complex models that require training on and analysis of message content; even when such systems are fast, deploying them in real time presents numerous challenges. Previous work has shown that a large portion of spam originates from fraudulent accounts. We therefore propose a system that uses web searches to determine whether a given account is fraudulent: it measures a user's online presence through web search results and labels accounts with insufficient web presence as likely fraudulent. Evaluated on a collection of real Twitter messages, our system achieves a true positive rate above 74% and a false positive rate below 11%, a detection rate comparable to those achieved by more expensive methods. Because it can operate before an account has produced a single tweet, our system could serve most effectively as a first line of defense combined with slower, more expensive machine learning methods, flagging fraudulent accounts before they have an opportunity to inject any spam into the ecosystem.
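The classification rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the query format, the hit-count threshold (`min_hits`), and the pluggable `search_hits` backend are all assumptions introduced for the example; a real deployment would supply a search engine API as the backend.

```python
# Sketch: flag an account as likely fraudulent when a web search for its
# screen name returns too few results (i.e., insufficient web presence).
from typing import Callable

def web_presence_score(screen_name: str,
                       search_hits: Callable[[str], int]) -> int:
    """Number of web search hits for the account's screen name.

    Quoting the name restricts results to exact mentions of the account.
    """
    return search_hits(f'"{screen_name}"')

def is_likely_fraudulent(screen_name: str,
                         search_hits: Callable[[str], int],
                         min_hits: int = 10) -> bool:
    """Label accounts whose web presence falls below the threshold.

    `min_hits` is an illustrative threshold, not a value from the paper.
    """
    return web_presence_score(screen_name, search_hits) < min_hits
```

Because the score depends only on the account name, the check can run before the account posts its first tweet, which is what makes the approach suitable as a first line of defense.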