Can click patterns across user's query logs predict answers to definition questions?

  • Authors:
  • Alejandro Figueroa

  • Affiliations:
  • Yahoo! Research Latin America, Blanco Encalada, Santiago, Chile

  • Venue:
  • EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
  • Year:
  • 2012

Abstract

In this paper, we examine click patterns produced by users of the Yahoo! search engine when submitting definition questions. Regularities across these click patterns are then used to construct a large and heterogeneous training corpus for answer ranking. In a nutshell, answers are extracted from clicked web-snippets originating from any class of web-site, including Knowledge Bases (KBs). Non-answers, in contrast, are acquired from redundant pieces of text across web-snippets. The effectiveness of this corpus was assessed by training two state-of-the-art models and using them to distinguish answers to unseen queries. These testing queries were also submitted by search engine users, and their answer candidates were taken from their respective returned web-snippets. With this corpus, both models achieved an accuracy higher than 70% and predicted over 85% of the answers clicked by users. In particular, our results underline the importance of non-KB training data.
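
The labelling idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the log schema, the function name `build_training_corpus`, splitting snippets on "...", and treating a fragment as redundant once it recurs across `min_queries` distinct queries are all assumptions made for illustration.

```python
from collections import defaultdict


def build_training_corpus(log, min_queries=2):
    """Label snippet fragments as answers (1) or non-answers (0).

    `log` is an iterable of (query, snippet_text, clicked) tuples.
    This schema is hypothetical, not taken from the paper.
    """
    positives = set()
    queries_per_fragment = defaultdict(set)

    for query, snippet, clicked in log:
        # Assumption: snippet fragments are separated by "..." markers.
        fragments = [f.strip().lower() for f in snippet.split("...") if f.strip()]
        for fragment in fragments:
            queries_per_fragment[fragment].add(query)
            if clicked:
                # Assumption: every fragment of a clicked snippet is an answer.
                positives.add(fragment)

    # Assumption: text recurring across several distinct queries is redundant
    # and serves as a non-answer, unless it was also seen in a clicked snippet.
    negatives = {
        fragment
        for fragment, queries in queries_per_fragment.items()
        if len(queries) >= min_queries and fragment not in positives
    }

    return [(f, 1) for f in sorted(positives)] + [(f, 0) for f in sorted(negatives)]


# Invented example data, for illustration only.
example_log = [
    ("define neutron star", "A neutron star is the collapsed core of a massive star", True),
    ("define neutron star", "Neutron star wallpapers ... Click here for more", False),
    ("define quasar", "A quasar is an extremely luminous active galactic nucleus", True),
    ("define quasar", "Click here for more ... Quasar ringtones", False),
]

corpus = build_training_corpus(example_log)
```

In this toy log, the two clicked definitions become positive examples, while "click here for more", which recurs under both queries, becomes the only negative example. A real corpus would need sentence segmentation and a more careful notion of redundancy than exact string matching.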