Finding the right facts in the crowd: factoid question answering over social media

  • Authors:
  • Jiang Bian; Yandong Liu; Eugene Agichtein; Hongyuan Zha

  • Affiliations:
  • Georgia Institute of Technology, Atlanta, GA, USA; Emory University, Atlanta, GA, USA; Emory University, Atlanta, GA, USA; Georgia Institute of Technology, Atlanta, GA, USA

  • Venue:
  • Proceedings of the 17th international conference on World Wide Web
  • Year:
  • 2008

Abstract

Community Question Answering has emerged as a popular and effective paradigm for a wide range of information needs. For example, to find an obscure piece of trivia, it is now possible and even very effective to post a question on a popular community QA site such as Yahoo! Answers, and to rely on other users to provide answers, often within minutes. The importance of such community QA sites is magnified as they create archives of millions of questions and hundreds of millions of answers, many of which are invaluable for the information needs of other searchers. However, to make this immense body of knowledge accessible, effective answer retrieval is required. In particular, as any user can contribute an answer to a question, the majority of the content reflects personal, often unsubstantiated opinions. A ranking that combines both relevance and quality is required to make such archives usable for factual information retrieval. This task is challenging, as the structure and the contents of community QA archives differ significantly from the web setting. To address this problem, we present a general ranking framework for factual information retrieval from social media. Results of a large-scale evaluation demonstrate that our method is highly effective at retrieving well-formed, factual answers to questions, as evaluated on a standard factoid QA benchmark. We also show that our learning framework can be tuned with a minimum of manual labeling. Finally, we analyze the results to gain a deeper understanding of which features are significant for social media search and retrieval. Our system can be used as a crucial building block for combining results from a variety of social media content with general web search results, and for better integrating social media content for effective information access.