Annotating web sites in social bookmarking systems has become a popular way to manage and find information on the web. The community structure of such systems attracts spammers: pages listing recent posts, popular pages, or specific tag pages can be manipulated easily. As a result, searching or tracking recent posts no longer delivers the quality results annotated by the community, but rather unsolicited, often commercial, web sites. To retain the benefits of sharing one's web content, spam-fighting mechanisms that can cope with the flexible strategies of spammers need to be developed. A classical approach in machine learning is to determine relevant features that describe the system's users, train different classifiers on the selected features, and choose the one with the most promising evaluation results. In this paper we transfer this approach to a social bookmarking setting in order to identify spammers. We present features that draw on the topological, semantic, and profile-based information that people make public when using the system. The dataset is a snapshot of the social bookmarking system BibSonomy and was built over the course of several months while the system was being cleaned of spam. Based on our features, we learn a large set of different classification models and compare their performance. Our results lay the groundwork for a first application in BibSonomy and for building more elaborate spam detection mechanisms.
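The classical workflow described above (describe each user by features, train several classifiers, keep the one that evaluates best) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three feature names, the toy numbers, the two deliberately simple classifiers, and the use of plain accuracy as the evaluation measure. None of it is taken from the paper's actual feature set, models, or BibSonomy data.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-user features (not the paper's):
#   [tags_per_post, digits_in_username, shared_domain_ratio]
# Label 1 = spammer, 0 = legitimate user. All values are invented.
Sample = Tuple[List[float], int]
Classifier = Callable[[List[float]], int]

TRAIN: List[Sample] = [
    ([12.0, 4.0, 0.9], 1),
    ([10.0, 3.0, 0.8], 1),
    ([9.0, 5.0, 0.7], 1),
    ([2.0, 0.0, 0.1], 0),
    ([3.0, 1.0, 0.2], 0),
    ([1.0, 0.0, 0.0], 0),
]
TEST: List[Sample] = [
    ([11.0, 2.0, 0.85], 1),
    ([8.0, 4.0, 0.6], 1),
    ([2.0, 1.0, 0.15], 0),
    ([4.0, 0.0, 0.3], 0),
]


def train_majority(train: List[Sample]) -> Classifier:
    """Baseline: always predict the most frequent training label (ties go to 1)."""
    spammers = sum(label for _, label in train)
    majority = 1 if spammers * 2 >= len(train) else 0
    return lambda features: majority


def train_nearest_centroid(train: List[Sample]) -> Classifier:
    """Predict the class whose mean feature vector is closest to the user."""
    centroids: Dict[int, List[float]] = {}
    for cls in (0, 1):
        rows = [x for x, y in train if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(features: List[float]) -> int:
        def sq_dist(center: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(features, center))
        return min(centroids, key=lambda cls: sq_dist(centroids[cls]))

    return predict


def accuracy(clf: Classifier, data: List[Sample]) -> float:
    """Fraction of users in `data` that the classifier labels correctly."""
    return sum(clf(x) == y for x, y in data) / len(data)


def select_best(
    trainers: Dict[str, Callable[[List[Sample]], Classifier]],
    train: List[Sample],
    test: List[Sample],
) -> Tuple[str, Dict[str, float]]:
    """Train every candidate on `train`, score it on `test`, return the winner."""
    scores = {name: accuracy(fit(train), test) for name, fit in trainers.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


if __name__ == "__main__":
    candidates = {
        "majority": train_majority,
        "nearest_centroid": train_nearest_centroid,
    }
    best_name, all_scores = select_best(candidates, TRAIN, TEST)
    print(best_name, all_scores)
```

In a realistic setting the candidates would be full learning algorithms, and accuracy would likely give way to a measure that copes better with heavily imbalanced spam and non-spam classes; the selection loop itself would stay the same.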