Word image based latent semantic indexing for conceptual querying in document image databases

  • Authors:
  • Sameek Banerjee;Gaurav Harit;Santanu Chaudhury

  • Affiliations:
  • IIT Delhi, New Delhi, India;IIT Delhi, New Delhi, India;IIT Delhi, New Delhi, India

  • Venue:
  • ICDAR '07 Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 02
  • Year:
  • 2007

Abstract

In this paper we present an application of latent semantic analysis (LSA) to the indexing and retrieval of document images containing text. The query is specified as a set of word images, and the documents that best match the query representation in the latent semantic space are retrieved. We show through extensive experiments on a large database that using LSA for document images improves retrieval precision, as it does for electronic text documents.
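
The retrieval step described in the abstract can be illustrated with a minimal sketch of generic LSA. It is not the authors' implementation: the abstract does not say how word images are turned into index terms, so the sketch assumes each word image has already been mapped to a discrete "visual term" ID, giving a term-by-document count matrix. The function names `build_lsa`, `fold_in`, and `rank_documents` and the toy matrix are hypothetical, chosen for illustration only.

```python
import numpy as np

def build_lsa(term_doc, k):
    """Truncated SVD of a term-by-document count matrix (rows = terms)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    doc_vecs = Vt[:k, :].T  # one row per document in the k-dim latent space
    return U_k, s_k, doc_vecs

def fold_in(query_counts, U_k, s_k):
    """Project a query term-count vector into the latent space: q^T U_k S_k^{-1}."""
    return (query_counts @ U_k) / s_k

def rank_documents(query_counts, U_k, s_k, doc_vecs):
    """Rank documents by cosine similarity to the folded-in query."""
    q = fold_in(query_counts, U_k, s_k)
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q)
    denom = np.where(denom == 0, 1.0, denom)  # guard against zero vectors
    sims = (doc_vecs @ q) / denom
    return np.argsort(-sims), sims

# Assumed toy data: 4 visual terms x 4 document images.
# Terms 0 and 1 co-occur in document 0; document 1 contains only term 1.
term_doc = np.array([
    [2, 0, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
], dtype=float)

U_k, s_k, doc_vecs = build_lsa(term_doc, k=2)
query = np.array([1.0, 0.0, 0.0, 0.0])  # query made of a single word image (term 0)
order, sims = rank_documents(query, U_k, s_k, doc_vecs)
```

In this toy setup, document 1 never contains the query term, yet it receives a high similarity because term 1 co-occurs with term 0 elsewhere in the collection. This is the kind of conceptual matching that LSA is intended to provide over exact term matching.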