Learning similarity function for rare queries

  • Authors:
  • Jingfang Xu;Gu Xu

  • Affiliations:
  • Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China

  • Venue:
  • Proceedings of the fourth ACM international conference on Web search and data mining
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

The key element of many query processing tasks can be formalized as calculation of similarities between queries. These include query suggestion, query reformulation, and query expansion. Although many methods have been proposed for query similarity calculation, they could perform poorly on rare queries. As far as we know, there was no previous work particularly about rare query similarity calculation, and this paper tries to study this problem. Specifically, we address three problems. Firstly, we define an n-gram space to represent queries with their own content and a similarity function to measure the similarities between queries. Secondly, we propose learning the similarity function by leveraging the training data derived from user behavior data. This is formalized as an optimization problem and a metric learning approach is employed to solve it efficiently. Finally, we exploit locality sensitive hashing for efficient retrieval of similar queries from a large query repository. We experimentally verified the effectiveness of the proposed approach by showing that our method can indeed enhance the accuracy of query similarity calculation for rare queries and efficiently retrieve similar queries. As an application, we also experimentally demonstrated that the similar queries found by our method can significantly improve search relevance.