Document Images Analysis Solutions for Digital libraries

  • Authors:
  • F. Le Bourgeois;E. Trinh;B. Allier;V. Eglin;H. Emptoz

  • Venue:
  • DIAL '04 Proceedings of the First International Workshop on Document Image Analysis for Libraries (DIAL'04)
  • Year:
  • 2004

Abstract

Today the development of digital libraries is reaching technological limits because of the difficulty of automatically processing a growing mass of digitized document images from different origins. The main problem is the high cost of the digitization and retro-conversion processes, which include image capture and indexing, metadata extraction, image storage, conversion into a reusable electronic form, publication on the Internet, and reduction of image file sizes for faster access. To reduce the cost of digitization and retro-conversion, we need to break technological bottlenecks, for example by developing "intelligent" digitizers that reduce manual intervention and produce the best-quality images. Retro-conversion needs efficient software that analyzes image contents and automatically extracts all the information necessary for image indexing. Other technological bottlenecks must also be considered, such as the need for an open file format that can describe digitized documents as heterogeneous media. This paper is not a state-of-the-art survey of the domain; it describes some cases that we have studied in our laboratory over the past years.