Heterogeneous graph-based intent learning with queries, web pages and Wikipedia concepts

  • Authors:
  • Xiang Ren;Yujing Wang;Xiao Yu;Jun Yan;Zheng Chen;Jiawei Han

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, USA;Microsoft Research, Beijing, China;University of Illinois at Urbana-Champaign, Urbana, USA;Microsoft Research, Beijing, China;Microsoft Research, Beijing, China;University of Illinois at Urbana-Champaign, Urbana, USA

  • Venue:
  • Proceedings of the 7th ACM international conference on Web search and data mining
  • Year:
  • 2014

Abstract

The problem of learning user search intents has attracted intensive attention from both industry and academia. However, state-of-the-art intent learning algorithms suffer from different drawbacks when they use only a single type of data source. For example, query text alone has difficulty distinguishing ambiguous queries, and search logs are biased by the order of search results and by users' noisy click behaviors. In this work, we leverage, for the first time, three types of objects (queries, web pages and Wikipedia concepts) collaboratively to learn generic search intents, and we construct a heterogeneous graph to represent the multiple types of relationships between them. A novel unsupervised method called heterogeneous graph-based soft-clustering is developed to derive an intent indicator for each object based on the constructed heterogeneous graph. With the proposed co-clustering method, one can enhance the quality of intent understanding by taking advantage of different types of data, which complement each other, and make the implicit intents easier to interpret with explicit knowledge from Wikipedia concepts. Experiments on two real-world datasets demonstrate the effectiveness of the proposed method: it achieves a 9.25% improvement in NDCG on a search ranking task and a 4.67% improvement in Rand index on an object co-clustering task, compared with the best state-of-the-art method.
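
The two evaluation measures named in the abstract, NDCG and the Rand index, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the NDCG variant shown (linear gain with a log2 rank discount) is only one common formulation and may differ from the one used in the paper, and the Rand index here assumes each object has already been assigned a single hard cluster label (for soft intent indicators, one would first have to pick a label per object, for example the highest-scoring intent).

```python
import math
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two hard clusterings agree.

    A pair counts as an agreement when both clusterings place the two
    objects together, or both place them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one object: trivially in agreement
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


def dcg(relevances):
    """Discounted cumulative gain with linear gain (an assumed variant)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """NDCG of a ranked list of graded relevance scores, cut off at k."""
    ranked = list(relevances)[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Under these assumptions, `rand_index` ignores how clusters are named and only compares pair memberships, while `ndcg` equals 1.0 exactly when the ranked list is already in non-increasing order of relevance.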