Towards efficient similar sentences extraction

  • Authors:
  • Yanhui Gu;Zhenglu Yang;Miyuki Nakano;Masaru Kitsuregawa

  • Affiliations:
  • Institute of Industrial Science, The University of Tokyo, Japan;Institute of Industrial Science, The University of Tokyo, Japan;Institute of Industrial Science, The University of Tokyo, Japan;Institute of Industrial Science, The University of Tokyo, Japan

  • Venue:
  • IDEAL'12 Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2012

Abstract

Similar sentence extraction is essential to many applications, such as natural language processing, Web page retrieval, and question answering. Although many studies have explored this problem, most of them focus on improving effectiveness. In this paper, we address efficiency: given a sentence collection, how can the top-k sentences semantically similar to a query be discovered efficiently? This issue matters in real applications because data volumes have become huge and existing state-of-the-art strategies cannot satisfy users' performance requirements. We propose efficient strategies to tackle the problem based on a general framework. Extensive experimental evaluations demonstrate that our proposal is more efficient than the state-of-the-art approach.
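
To make the problem statement concrete, the top-k query can be sketched as a naive baseline that scores every sentence in the collection. This is only an illustration of the task, not the strategies proposed in the paper: the names `jaccard` and `top_k_similar` are hypothetical, and token-set Jaccard overlap stands in as an assumed placeholder for a real semantic similarity measure.

```python
import heapq


def jaccard(a: str, b: str) -> float:
    """Placeholder similarity (assumption): Jaccard overlap of token sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def top_k_similar(query, sentences, k, sim=jaccard):
    """Naive baseline: score every sentence, keep the k best in a bounded min-heap."""
    if k <= 0:
        return []
    heap = []  # (score, -index) pairs; the weakest kept candidate sits at heap[0]
    for i, sentence in enumerate(sentences):
        item = (sim(query, sentence), -i)  # -i makes earlier sentences win ties
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    ranked = sorted(heap, reverse=True)  # best score first
    return [(sentences[-neg_i], score) for score, neg_i in ranked]


collection = [
    "the cat sat on the mat",
    "a dog barked loudly",
    "the cat lay on the mat",
    "stock prices fell",
]
results = top_k_similar("the cat sat on a mat", collection, k=2)
```

The baseline evaluates the similarity function once per sentence, so its cost grows linearly with the collection size. That exhaustive scan is the kind of cost an efficiency-oriented approach would aim to avoid on large collections.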