A few bad votes too many?: towards robust ranking in social media

  • Authors:
  • Jiang Bian;Yandong Liu;Eugene Agichtein;Hongyuan Zha

  • Affiliations:
  • Georgia Institute of Technology;Emory University;Emory University;Georgia Institute of Technology

  • Venue:
  • AIRWeb '08 Proceedings of the 4th international workshop on Adversarial information retrieval on the web
  • Year:
  • 2008

Abstract

Online social media draws heavily on active reader participation, such as voting or rating of news stories, articles, or responses to a question. This user feedback is invaluable for ranking, filtering, and retrieving high-quality content, tasks that are crucial given the explosive amount of social content on the web. Unfortunately, as social media moves into the mainstream and gains in popularity, the quality of the user feedback degrades. Some of this is due to noise, but, increasingly, a small fraction of malicious users are trying to "game the system" by selectively promoting or demoting content for profit or fun. Hence, an effective ranking of social media content must be robust to noise in the user interactions, and in particular to vote spam. We describe a machine learning based ranking framework for social media that integrates user interactions and content relevance, and demonstrate its effectiveness for answer retrieval in a popular community question answering portal. We consider several vote spam attacks, and introduce a method of training our ranker to increase its robustness to some common forms of vote spam attacks. The results of our large-scale experimental evaluation show that our ranker is significantly more robust to vote spam compared to a state-of-the-art baseline as well as the ranker not explicitly trained to handle malicious interactions.