Open entity extraction from web search query logs

  • Authors:
  • Alpa Jain;Marco Pennacchiotti

  • Affiliations:
  • Yahoo! Labs;Yahoo! Labs

  • Venue:
  • COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we propose a completely unsupervised method for open-domain entity extraction and clustering over query logs. The underlying hypothesis is that classes defined by mining search user activity may significantly differ from those typically considered over web documents, in that they better model the user space, i.e. users' perception and interests. We show that our method outperforms state of the art (semi-)supervised systems based either on web documents or on query logs (16% gain on the clustering task). We also report evidence that our method successfully supports a real world application, namely keyword generation for sponsored search.