Whom to ask?: jury selection for decision making tasks on micro-blog services

  • Authors:
  • Caleb Chen Cao;Jieying She;Yongxin Tong;Lei Chen

  • Affiliations:
  • The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, PR China;The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, PR China;The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, PR China;The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, PR China

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

It is universal to see people obtain knowledge on micro-blog services by asking others decision making questions. In this paper, we study the Jury Selection Problem(JSP) by utilizing crowdsourcing for decision making tasks on micro-blog services. Specifically, the problem is to enroll a subset of crowd under a limited budget, whose aggregated wisdom via Majority Voting scheme has the lowest probability of drawing a wrong answer(Jury Error Rate-JER). Due to various individual error-rates of the crowd, the calculation of JER is non-trivial. Firstly, we explicitly state that JER is the probability when the number of wrong jurors is larger than half of the size of a jury. To avoid the exponentially increasing calculation of JER, we propose two efficient algorithms and an effective bounding technique. Furthermore, we study the Jury Selection Problem on two crowdsourcing models, one is for altruistic users(AltrM) and the other is for incentive-requiring users(PayM) who require extra payment when enrolled into a task. For the AltrM model, we prove the monotonicity of JER on individual error rate and propose an efficient exact algorithm for JSP. For the PayM model, we prove the NP-hardness of JSP on PayM and propose an efficient greedy-based heuristic algorithm. Finally, we conduct a series of experiments to investigate the traits of JSP, and validate the efficiency and effectiveness of our proposed algorithms on both synthetic and real micro-blog data.