Segmentation of Text and Graphics from Document Images

  • Authors:
  • S. Chowdhury; S. Mandal; A. Das; B. Chanda

  • Affiliations:
  • University, Shibpur, India; University, Shibpur, India; University, Shibpur, India; Indian Statistical Institute

  • Venue:
  • ICDAR '07 Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 02
  • Year:
  • 2007

Abstract

Text, graphics and half-tones are the major constituents of any document page. While half-tones can be characterised by their inherent intensity variation, text and graphics share common characteristics except for a difference in spatial distribution. The success of document image analysis systems depends on the proper segmentation of text and graphics, as text is further subdivided into other classes such as heading, table and math-zones. Segmentation of graphics is essential for better OCR performance and for vectorization in computer vision applications. Segmenting graphics from text is particularly difficult when the graphics are made of small components (dashed or dotted lines, etc.), which have many features similar to text. Here we propose a robust technique for segmenting all sorts of graphics and text in any orientation from document pages.