Document image analysis for active reading

  • Authors:
  • Claudie Faure; Nicole Vincent

  • Affiliations:
  • CNRS -- LTCI, Paris cedex; CRIP5 -- Université Paris Descartes, Paris cedex

  • Venue:
  • SADPI '07: Proceedings of the 2007 International Workshop on Semantically Aware Document Processing and Indexing
  • Year:
  • 2007

Abstract

A huge number of documents that were once available only in libraries are now on the web. Web access is a way to protect cultural heritage and to facilitate knowledge transmission. Most of these documents are displayed as images of the original paper pages and are indexed by hand. In this paper, we present how and why Document Image Analysis contributes to building the Digital Libraries of the future. Readers expect human-centred interactive reading stations, which implies producing hyperdocuments that fit the reader's intentions and needs. Image analysis makes it possible to extract and categorize the meaningful document components and their relationships; it also provides visualisations of the original images adapted to the reader. Document Image Analysis is an essential prerequisite for enriching hyperdocuments that support content-based reading activities such as information seeking and navigation. This paper focuses on the function of the original image: it is both a reference for the reader and the input data that are processed to automatically detect what makes sense in a document.