Learning with click graph for query intent classification

  • Authors:
  • Xiao Li;Ye-Yi Wang;Dou Shen;Alex Acero

  • Affiliations:
  • Microsoft Research;Microsoft Research;Microsoft Corporation;Microsoft Research

  • Venue:
  • ACM Transactions on Information Systems (TOIS)
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Topical query classification, as one step toward understanding users' search intent, is gaining increasing attention in information retrieval. Previous works on this subject primarily focused on enrichment of query features, for example, by augmenting queries with search engine results. In this work, we investigate a completely orthogonal approach—instead of improving feature representation, we aim at drastically increasing the amount of training data. To this end, we propose two semisupervised learning methods that exploit user click-through data. In one approach, we infer class memberships of unlabeled queries from those of labeled ones according to their proximities in a click graph; and then use these automatically labeled queries to train classifiers using query terms as features. In a second approach, click graph learning and query classifier training are conducted jointly with an integrated objective. Our methods are evaluated in two applications, product intent and job intent classification. In both cases, we expand the training data by over two orders of magnitude, leading to significant improvements in classification performance. An additional finding is that with a large amount of training data obtained in this fashion, a classifier based on simple query term features can outperform those using state-of-the-art, augmented features.