A study of the integration of passage-, document-, and cluster-based information for re-ranking search results

  • Authors: Eyal Krikon; Oren Kurland

  • Affiliations: Faculty of Industrial Engineering and Management, Technion, Israel Institute of Technology, Haifa, Israel 32000 (both authors)

  • Venue: Information Retrieval
  • Year: 2011

Abstract

Cluster-based and passage-based document retrieval paradigms have been shown to be effective. While the former utilize query-related corpus context manifested in clusters of similar documents, the latter address the fact that a document can be relevant even if only a very small part of it contains query-pertaining information. Hence, cluster-based approaches can be viewed as "expanding" the document representation, while passage-based approaches can be thought of as utilizing a "contracted" document representation. We present a study of the relative benefits of using each of these two approaches, and of the potential merits of their integration. To that end, we devise two methods that integrate whole-document-based, cluster-based, and passage-based information. The methods are applied to the re-ranking task, that is, re-ordering documents in an initially retrieved list so as to improve precision at the very top ranks. Extensive empirical evaluation attests to the potential merits of integrating these information types. Specifically, the resultant performance substantially transcends that of the initial ranking, and is often better than that of a state-of-the-art pseudo-feedback-based query expansion approach.
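
The abstract does not spell out the two integration methods, so the following is only an illustrative sketch of what re-ranking with the three information types could look like. It assumes a plain linear interpolation of a whole-document score, the best passage score of the document, and the mean score of the clusters containing the document. The function name `rerank`, the weights, the max-passage and mean-cluster aggregation, and all example scores are hypothetical choices made for illustration; they are not the methods or data of the paper.

```python
from typing import Dict, List


def rerank(
    doc_scores: Dict[str, float],
    passage_scores: Dict[str, List[float]],
    cluster_scores: Dict[str, float],
    doc_clusters: Dict[str, List[str]],
    w_doc: float = 0.4,
    w_psg: float = 0.3,
    w_cls: float = 0.3,
) -> List[str]:
    """Re-order an initially retrieved list (hypothetical interpolation scheme).

    doc_scores:     query similarity of each whole document in the initial list
    passage_scores: query similarities of the passages of each document
    cluster_scores: query similarity of each cluster of similar documents
    doc_clusters:   ids of the clusters that each document belongs to
    """
    combined = {}
    for doc_id, doc_score in doc_scores.items():
        # "Contracted" view: the best-matching passage represents the document.
        best_passage = max(passage_scores.get(doc_id, []), default=0.0)

        # "Expanded" view: average score of the clusters containing the document.
        clusters = doc_clusters.get(doc_id, [])
        cluster_part = (
            sum(cluster_scores.get(c, 0.0) for c in clusters) / len(clusters)
            if clusters
            else 0.0
        )

        combined[doc_id] = (
            w_doc * doc_score + w_psg * best_passage + w_cls * cluster_part
        )

    # Highest combined score first; ties are broken by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


# Made-up scores: the initial ranking by whole-document score is d1, d2, d3.
doc_scores = {"d1": 0.9, "d2": 0.5, "d3": 0.4}
passage_scores = {"d1": [0.1], "d2": [0.2, 0.9], "d3": [0.3]}
cluster_scores = {"c1": 0.8, "c2": 0.2}
doc_clusters = {"d1": ["c2"], "d2": ["c1"], "d3": ["c1"]}

ranking = rerank(doc_scores, passage_scores, cluster_scores, doc_clusters)
print(ranking)
```

In this toy input, `d2` has one strongly matching passage and sits in a high-scoring cluster, so under the assumed weights it moves ahead of `d1` despite a lower whole-document score. Setting `w_psg` and `w_cls` to zero reduces the sketch to the initial ranking.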