Spam filtering in Twitter using sender-receiver relationship

Authors:
Jonghyuk Song;Sangho Lee;Jong Kim
Affiliations:
Dept. of CSE, POSTECH, Republic of Korea;Dept. of CSE, POSTECH, Republic of Korea;Div. of ITCE, POSTECH, Republic of Korea
Venue:
RAID'11 Proceedings of the 14th international conference on Recent Advances in Intrusion Detection
Year:
2011

Abstract

Twitter is one of the most visited sites today, and the volume of Twitter spam is constantly increasing. Because Twitter spam differs from traditional spam such as email and blog spam, conventional spam filtering methods are ill-suited to detecting it. Many researchers have therefore proposed schemes to detect spammers on Twitter. These schemes rely on features of spam accounts, such as content similarity, account age, and the ratio of URLs. However, using account features to detect spam has two significant problems. First, spammers can easily fabricate account features. Second, account features cannot be collected until spammers have already carried out a number of malicious activities, so spammers are detected only after they have sent a number of spam messages. In this paper, we propose a novel spam filtering system that detects spam messages on Twitter. Instead of account features, we use relation features, such as the distance and connectivity between a message sender and a message receiver, to decide whether the current message is spam. Unlike account features, relation features are difficult for spammers to manipulate and can be collected immediately. We collected a large number of spam and non-spam Twitter messages, and then built and compared several classifiers. Our analysis shows that most spam comes from accounts that have only a weak relation to the receiver. We also show that our scheme is better suited to detecting Twitter spam than previous schemes.
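
The relation features named in the abstract can be illustrated with a minimal sketch. The abstract does not define how distance and connectivity are measured, so everything below is an assumption made for illustration: the follow graph is a plain dictionary mapping each account to the set of accounts it follows, distance is taken as the hop count of the shortest directed path from sender to receiver, and connectivity is approximated by counting the accounts that sit directly between the two. The function names, the `max_hops` cutoff, and the sample accounts are hypothetical and are not taken from the paper.

```python
from collections import deque


def distance(graph, sender, receiver, max_hops=4):
    """Hop count of the shortest directed path, or None if none exists within max_hops.

    Assumption: graph[a] is the set of accounts that a follows.
    """
    if sender == receiver:
        return 0
    seen = {sender}
    queue = deque([(sender, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the cutoff
        for nxt in graph.get(node, ()):
            if nxt == receiver:
                return hops + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def shared_neighbors(graph, sender, receiver):
    """Crude connectivity stand-in: accounts the sender follows that follow the receiver."""
    return sum(
        1 for mid in graph.get(sender, ()) if receiver in graph.get(mid, ())
    )


def relation_features(graph, sender, receiver):
    """Bundle both illustrative relation features for one message."""
    return {
        "distance": distance(graph, sender, receiver),
        "connectivity": shared_neighbors(graph, sender, receiver),
    }


# Hypothetical follow graph used only for demonstration.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave"},
    "dave": set(),
    "spammer": set(),
}

close = relation_features(follows, "alice", "dave")
far = relation_features(follows, "spammer", "dave")
```

In this toy graph, `close` has a distance of 2 and a connectivity of 2, while `far` has no path at all and a connectivity of 0. A feature dictionary of this kind could be passed to any standard classifier; which classifiers and exact feature definitions the authors used is not stated in the abstract.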