Sequence clustering and labeling for unsupervised query intent discovery

  • Authors: Jackie Chi Kit Cheung; Xiao Li
  • Affiliations: University of Toronto, Toronto, ON, Canada; Facebook Inc., Palo Alto, CA, USA
  • Venue: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining
  • Year: 2012

Abstract

One popular form of semantic search observed in several modern search engines is to recognize query patterns that trigger instant answers or domain-specific search, producing semantically enriched search results. This often requires understanding the query intent in addition to the meaning of the query terms in order to access structured data sources. A major challenge in intent understanding is to construct a domain-dependent schema and to annotate search queries based on such a schema, a process that to date has required much manual annotation effort. We present an unsupervised method for clustering queries with similar intent and for producing a pattern consisting of a sequence of semantic concepts and/or lexical items for each intent. Furthermore, we leverage the discovered intent patterns to automatically annotate a large number of queries beyond those used in clustering. We evaluated our method on 10 selected domains, discovering over 1400 intent patterns and automatically annotating 125K (and potentially many more) queries. We found that over 90% of patterns and 80% of instance annotations tested are judged to be correct by a majority of annotators.
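
To make the idea of an intent pattern concrete, the following is a minimal sketch of how a pattern made of semantic concepts and lexical items could be matched against a query to annotate it. It is an illustration only and not the method of the paper: the `CONCEPTS` lexicon, the bracketed slot notation such as `[MOVIE]`, the `annotate` function, and the example queries are all hypothetical, and exact dictionary lookup stands in for whatever concept recognition the actual system uses.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical concept lexicon (an assumption for this sketch):
# concept name -> known lowercase surface forms.
CONCEPTS: Dict[str, Set[str]] = {
    "MOVIE": {"inception", "the matrix"},
    "CITY": {"toronto", "palo alto"},
}

# A pattern is a sequence of items. "[X]" is a slot for concept X;
# any other item is a literal word that must appear verbatim.
Pattern = List[str]
Annotation = List[Tuple[str, str]]


def annotate(query: str, pattern: Pattern) -> Optional[Annotation]:
    """Return (pattern item, matched text) pairs if the whole query
    matches the pattern, otherwise None."""
    tokens = query.lower().split()

    def match(ti: int, pi: int) -> Optional[Annotation]:
        # All pattern items consumed: succeed only if the query is too.
        if pi == len(pattern):
            return [] if ti == len(tokens) else None
        item = pattern[pi]
        if item.startswith("[") and item.endswith("]"):
            surface_forms = CONCEPTS.get(item[1:-1], set())
            # Try multi-word spans, longest first, backtracking on failure.
            for end in range(len(tokens), ti, -1):
                span = " ".join(tokens[ti:end])
                if span in surface_forms:
                    rest = match(end, pi + 1)
                    if rest is not None:
                        return [(item, span)] + rest
            return None
        # Literal lexical item: must equal the current token.
        if ti < len(tokens) and tokens[ti] == item:
            rest = match(ti + 1, pi + 1)
            if rest is not None:
                return [(item, tokens[ti])] + rest
        return None

    return match(0, 0)


pattern = ["[MOVIE]", "showtimes", "in", "[CITY]"]
print(annotate("The Matrix showtimes in Palo Alto", pattern))
print(annotate("cheap flights to toronto", pattern))
```

In this toy setting, a query that fits the pattern yields one labeled span per pattern item, and a query that does not fit yields `None`. This mirrors, in a very simplified form, how a discovered pattern could be reused to annotate queries beyond those used in clustering.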