Sentence ranking for document indexing

  • Authors:
  • Saptaditya Maiti;Deba P. Mandal;Pabitra Mitra

  • Affiliations:
  • Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India;Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India;Dept. of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India

  • Venue:
  • PReMI'11 Proceedings of the 4th international conference on Pattern recognition and machine intelligence
  • Year:
  • 2011

Abstract

This article discusses a new document indexing scheme for information retrieval. For structured (e.g., scientific) documents, Pasi et al. proposed assigning different weights to different sections according to their importance in the document. This concept is extended here to unstructured documents. Each sentence in a document is first assigned a weight (its significance in the document) with the help of a summarization technique. The term frequency of a term is then computed as the sum of the weights of the sentences in which the term occurs. The method is evaluated on a real-life dataset using leading existing information retrieval models, and its performance is found to be superior to that of conventional indexing schemes.
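
The weighted term frequency described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which summarization technique produces the sentence weights, so the weights here are simply supplied by hand. The function name `weighted_term_frequencies`, the regex tokenizer, and the choice to count a sentence's weight once per distinct term (rather than once per occurrence) are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def weighted_term_frequencies(sentences, weights):
    """Return {term: sum of weights of the sentences containing the term}.

    Hypothetical helper: `weights` stands in for the sentence scores that
    a summarization technique would produce.
    """
    if len(sentences) != len(weights):
        raise ValueError("one weight per sentence is required")

    tf = defaultdict(float)
    for sentence, weight in zip(sentences, weights):
        # Assumption: lowercase alphanumeric tokens, and each sentence
        # contributes its weight once per distinct term it contains.
        terms = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        for term in terms:
            tf[term] += weight
    return dict(tf)


# Toy document with hand-picked (not computed) sentence weights.
example_sentences = [
    "Indexing assigns weights to terms.",
    "Terms in important sentences get larger weights.",
    "The weather was pleasant.",
]
example_weights = [0.5, 0.25, 0.125]
example_tf = weighted_term_frequencies(example_sentences, example_weights)
```

In this toy document, "terms" occurs in the first two sentences and receives a frequency of 0.5 + 0.25 = 0.75, while "weather" occurs only in the low-weight third sentence and receives 0.125. Under conventional indexing, both terms would be counted by raw occurrences regardless of how significant their sentences are.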