A novel composite kernel for finding similar questions in CQA services

  • Authors:
  • Jun Wang;Zhoujun Li;Xia Hu;Biyun Hu

  • Affiliations:
  • School of Computer Science and Engineering, Beihang University, Beijing, China;School of Computer Science and Engineering, Beihang University, Beijing, China;School of Computing, National University of Singapore, Singapore;School of Computer Science and Engineering, Beihang University, Beijing, China

  • Venue:
  • WAIM'10 Proceedings of the 11th international conference on Web-age information management
  • Year:
  • 2010

Abstract

Finding similar questions in Community Question Answering (CQA) services plays an increasingly important role in current web and IR applications. The task aims to retrieve historical questions that are similar or relevant to new questions posed by users. However, traditional "bag-of-words" models often fail to measure the similarity between question sentences, because they usually ignore sequential and syntactic information. In this paper, we propose a novel composite kernel to improve the accuracy of question matching. Our study illustrates that the composite kernel can efficiently capture both lexical semantics and syntactic information in a question sentence by leveraging a word sequence kernel, a POS tag sequence kernel, and a syntactic tree kernel. Experimental results on real-world datasets show that our proposed method significantly outperforms the state-of-the-art models.
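
The abstract does not say how the three kernels are defined or combined, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a weighted sum of normalized sub-kernels with equal weights. It substitutes a contiguous n-gram (spectrum) kernel for the word and POS tag sequence kernels and a shared-production count for the syntactic tree kernel. All function names, the nested-tuple tree format, and the example questions are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, max_n=2):
    """Count all contiguous n-grams of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def spectrum_kernel(a, b, max_n=2):
    """Inner product of n-gram count vectors (stand-in for a sequence kernel)."""
    ca, cb = ngram_counts(a, max_n), ngram_counts(b, max_n)
    return float(sum(ca[g] * cb[g] for g in ca.keys() & cb.keys()))


def productions(tree):
    """Count (parent label, child labels) productions in a nested-tuple tree."""
    if isinstance(tree, str):  # a leaf word has no production of its own
        return Counter()
    label, children = tree[0], tree[1:]
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts = Counter({(label, child_labels): 1})
    for child in children:
        counts += productions(child)
    return counts


def production_kernel(t1, t2):
    """Inner product of production counts (stand-in for a tree kernel)."""
    p1, p2 = productions(t1), productions(t2)
    return float(sum(p1[p] * p2[p] for p in p1.keys() & p2.keys()))


def normalized(kernel, a, b):
    """Cosine-style normalization so that k(x, x) = 1 for non-empty x."""
    denom = sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom else 0.0


def composite_kernel(q1, q2, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of word, POS tag, and tree similarities (assumed form)."""
    w_word, w_pos, w_tree = weights
    return (w_word * normalized(spectrum_kernel, q1["words"], q2["words"])
            + w_pos * normalized(spectrum_kernel, q1["pos"], q2["pos"])
            + w_tree * normalized(production_kernel, q1["tree"], q2["tree"]))


# Hypothetical, hand-annotated questions (tags and parses written by hand).
q1 = {
    "words": ["how", "do", "i", "reset", "my", "password"],
    "pos": ["WRB", "VBP", "PRP", "VB", "PRP$", "NN"],
    "tree": ("SBARQ", ("WHADVP", ("WRB", "how")),
             ("SQ", ("VBP", "do"), ("NP", ("PRP", "i")),
              ("VP", ("VB", "reset"),
               ("NP", ("PRP$", "my"), ("NN", "password"))))),
}
q2 = {
    "words": ["how", "can", "i", "reset", "my", "password"],
    "pos": ["WRB", "MD", "PRP", "VB", "PRP$", "NN"],
    "tree": ("SBARQ", ("WHADVP", ("WRB", "how")),
             ("SQ", ("MD", "can"), ("NP", ("PRP", "i")),
              ("VP", ("VB", "reset"),
               ("NP", ("PRP$", "my"), ("NN", "password"))))),
}
q3 = {
    "words": ["what", "is", "the", "capital", "of", "france"],
    "pos": ["WP", "VBZ", "DT", "NN", "IN", "NNP"],
    "tree": ("SBARQ", ("WHNP", ("WP", "what")),
             ("SQ", ("VBZ", "is"),
              ("NP", ("NP", ("DT", "the"), ("NN", "capital")),
               ("PP", ("IN", "of"), ("NP", ("NNP", "france")))))),
}

print(composite_kernel(q1, q2))  # near-paraphrase: expected to score high
print(composite_kernel(q1, q3))  # unrelated question: expected to score low
```

A sum of valid kernels with non-negative weights is itself a valid kernel, which is one reason a linear combination is a common way to build composite kernels. The weights here are arbitrary; in practice they would be tuned on held-out data.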