Social networking sites have become very popular in recent years. Users rely on them to find new friends and to update their existing friends on their latest thoughts and activities. Among these sites, Twitter is the fastest growing. Its popularity also attracts many spammers, who infiltrate legitimate users' accounts with large numbers of spam messages. In this paper, we discuss some user-based and content-based features that differ between spammers and legitimate users. We then use these features to facilitate spam detection. Using the API methods provided by Twitter, we crawled active Twitter users, their followers/following information, and their 100 most recent tweets. We then evaluated our detection scheme based on the suggested user-based and content-based features. Our results show that, among the four classifiers we evaluated, the Random Forest classifier produces the best results. Our spam detector achieves 95.7% precision and 95.7% F-measure using the Random Forest classifier.
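To make the pipeline described above more concrete, the following is a minimal sketch of how user-based and content-based features might be derived from a crawled account, and how precision and F-measure are computed from classifier predictions. The abstract does not list the actual features, so the `Account` record, its field names, and the three features shown (follower ratio, link fraction, duplicate fraction) are hypothetical stand-ins rather than the features used in the paper. The classifier itself is omitted; the feature vectors would be passed to a Random Forest or any other classifier.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Account:
    """Hypothetical record for one crawled account (field names are assumptions)."""
    followers: int
    following: int
    tweets: List[str]  # most recent tweets, up to 100


def extract_features(account: Account) -> List[float]:
    """Return illustrative user-based and content-based features (assumed, not the paper's)."""
    n = len(account.tweets)
    # User-based: followers relative to accounts followed (+1 avoids division by zero).
    follower_ratio = account.followers / (account.following + 1)
    # Content-based: fraction of tweets that contain a link.
    url_fraction = sum("http" in t for t in account.tweets) / n if n else 0.0
    # Content-based: fraction of tweets that repeat an earlier tweet verbatim.
    duplicate_fraction = (n - len(set(account.tweets))) / n if n else 0.0
    return [follower_ratio, url_fraction, duplicate_fraction]


def precision_recall_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure for the positive class (spam = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Here the F-measure is the standard harmonic mean of precision and recall, so a detector reporting equal precision and F-measure, as above, also has recall equal to that value.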