Spammers in social bookmarking systems try to mimic the bookmarking behaviour of real users to gain the attention of other users or search engines. Several methods have been proposed for detecting such spam, including domain-specific features (such as URL terms) and the similarity of users to previously identified spammers. However, as shown in our previous work, it is possible to identify a large fraction of spam users based on purely structural features. The hypergraph connecting documents, users, and tags can be decomposed into connected components, and the large but non-giant components turned out to be almost entirely inhabited by spam users in the examined dataset. Here, we test to what degree the decomposition of the complete hypergraph is really necessary by examining the component structure of the induced user/document and user/tag graphs. While the connectivity of the user/tag graph does not help in classifying spammers, the connectivity of the user/document graph is already highly informative. It can, however, be augmented with connectivity information from the hypergraph. In our view, spam detection based on structural features, like the one proposed here, requires complex adaptation strategies from spammers and may complement other, more traditional detection approaches.
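To make the structural idea concrete, the following is a minimal sketch of the user/document variant, not the authors' implementation. It assumes posts arrive as (user, document) pairs, treats the largest connected component as the giant one, and measures component size by the number of users it contains. The function names and the `min_size` threshold are hypothetical; the abstract does not state how large a non-giant component must be to count as "large".

```python
from collections import defaultdict


def find(parent, node):
    """Return the representative of node's set, compressing the path as we go."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def user_components(posts):
    """Group users by connected component of the bipartite user/document graph.

    posts: iterable of (user, document) pairs.
    Returns a list of user sets, largest component first.
    """
    parent = {}
    for user, doc in posts:
        # Tag node kinds so a user and a document with the same name stay distinct.
        u, d = ("user", user), ("doc", doc)
        parent.setdefault(u, u)
        parent.setdefault(d, d)
        root_u, root_d = find(parent, u), find(parent, d)
        if root_u != root_d:
            parent[root_u] = root_d

    groups = defaultdict(set)
    for node in parent:
        if node[0] == "user":
            groups[find(parent, node)].add(node[1])
    return sorted(groups.values(), key=len, reverse=True)


def suspicious_users(posts, min_size=3):
    """Flag users in large but non-giant components.

    Assumption: the largest component is the giant one, and min_size is a
    hypothetical cutoff for what counts as a "large" component.
    """
    flagged = set()
    for component in user_components(posts)[1:]:
        if len(component) >= min_size:
            flagged |= component
    return flagged
```

The same routine could be run on (user, tag) pairs to obtain the user/tag variant, which the abstract reports as uninformative for classifying spammers. Any real cutoff for `min_size` would have to be chosen against labelled data.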