Newspaper Headlines Extraction from Microfilm Images

  • Authors:
  • Qing Hong Liu;Chew Lim Tan

  • Venue:
  • ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02), Volume 3
  • Year:
  • 2002

Abstract

Automatic indexing is important for a digital library that provides digitized images of old documents together with their electronic text. As an essential step in creating such a system, this paper discusses the extraction of headlines from old newspaper microfilms. Most research on document layout analysis has assumed relatively clean images. However, microfilm images of old newspapers present a challenge: they are usually insufficiently illuminated and considerably dirty. Because conventional threshold selection techniques are inadequate for these kinds of images, we propose a new, effective method for separating characters from the noisy background. A Run Length Smearing Algorithm (RLSA) is applied in the headline extraction. Experiments show that our approach improves the recall, precision and combined rates.