On sampling the wisdom of crowds: random vs. expert sampling of the Twitter stream

  • Authors:
  • Saptarshi Ghosh;Muhammad Bilal Zafar;Parantapa Bhattacharya;Naveen Sharma;Niloy Ganguly;Krishna Gummadi

  • Affiliations:
  • IIT Kharagpur & MPI-SWS, Kharagpur, India;MPI-SWS, Kaiserslautern, Saarbruecken, Germany;IIT Kharagpur & MPI-SWS, Kharagpur, India;University of Washington, Washington, WA, USA;IIT Kharagpur, Kharagpur, India;MPI-SWS, Kaiserslautern, Saarbruecken, Germany

  • Venue:
  • Proceedings of the 22nd ACM international conference on information & knowledge management
  • Year:
  • 2013

Abstract

Several applications today rely upon content streams crowd-sourced from online social networks. Since real-time processing of large amounts of data generated on these sites is difficult, analytics companies and researchers are increasingly resorting to sampling. In this paper, we investigate the crucial question of how to sample the data generated by users in social networks. The traditional method is to randomly sample all the data. We analyze a different sampling methodology, where content is gathered only from a relatively small subset (