Extraction of Type Style Based Meta-Information from Imaged Documents

  • Authors:
  • U. Garain; B. B. Chaudhuri

  • Venue:
  • ICDAR '99 Proceedings of the Fifth International Conference on Document Analysis and Recognition
  • Year:
  • 1999

Abstract

This paper considers the extraction of certain meta-information from printed documents without relying on OCR. It can be statistically verified that important terms in articles are printed in italic, bold, or all-capital style. Detecting these type styles helps to automatically extract the lines containing titles, authors' names, subtitles, and references, as well as sentences in the text that contain important terms. It also helps improve OCR performance on italicized text. Experimental results on the performance of the approach are presented for both good-quality and degraded document images.