Document image retrieval without OCRing using a video scanning system

  • Authors:
  • Ercan E. Kuruoglu; Vern T. Tan

  • Affiliations:
  • Istituto di Elaborazione della Informazione, CNR, Area della Ricerca di Pisa, Via Alfieri, 1, I-56010 Ghezzano (Pisa), Italy; Trinity College, Cambridge, CB2 1TQ, UK

  • Venue:
  • MULTIMEDIA '00 Proceedings of the 2000 ACM workshops on Multimedia
  • Year:
  • 2000

Abstract

In this paper, we propose a technique for efficient document retrieval from digital libraries that store document images in token-based compressed form. The query image is captured from a paper document by the video scanning tool of a multimedia system. The technique uses the layout information given by the relative positions of the character tokens on the page of a “query” paper document to retrieve the original document from the image database. It avoids OCRing both the query document and the documents in the database, and it also avoids decompressing the token-based compressed documents in the database, thereby saving significant time and computation.
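
The abstract does not specify the matching procedure, so the following is only a minimal sketch of how retrieval from token positions alone might look, not the authors' algorithm. It assumes each page is available as a list of (x, y) token positions, as a token-based compressed file would provide without decompression. The grid histogram signature, the cosine comparison, and the names `layout_signature`, `cosine`, and `retrieve` are all hypothetical choices made for illustration.

```python
from math import sqrt


def layout_signature(positions, grid=8):
    """Count tokens in each cell of a grid x grid partition of their bounding box.

    Normalising by the bounding box makes the signature insensitive to
    translation and uniform scaling (an assumed stand-in for the differences
    between a camera-captured query and the stored page).
    """
    xs = [x for x, _ in positions]
    ys = [y for _, y in positions]
    x0, y0 = min(xs), min(ys)
    width = max(max(xs) - x0, 1e-9)   # guard against a degenerate box
    height = max(max(ys) - y0, 1e-9)
    hist = [0] * (grid * grid)
    for x, y in positions:
        col = min(int((x - x0) / width * grid), grid - 1)
        row = min(int((y - y0) / height * grid), grid - 1)
        hist[row * grid + col] += 1
    return hist


def cosine(a, b):
    """Cosine similarity between two equal-length histograms."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = sqrt(sum(p * p for p in a))
    norm_b = sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_positions, database):
    """Return the name of the database page whose layout best matches the query.

    `database` maps a document name to its list of (x, y) token positions.
    """
    query_sig = layout_signature(query_positions)
    return max(
        database,
        key=lambda name: cosine(query_sig, layout_signature(database[name])),
    )
```

No character is recognised and no page image is reconstructed in this sketch; only token coordinates are compared, which is the source of the time savings the abstract describes. A real system would need a signature that is also robust to skew, partial views, and missing tokens.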