Detection and segmentation of tables and math-zones from document images

  • Authors:
  • S. Mandal; S. P. Chowdhury; A. K. Das; Bhabatosh Chanda

  • Affiliations:
  • Bengal Engineering and Science University, Shibpur, Howrah, India; Bengal Engineering and Science University, Shibpur, Howrah, India; Bengal Engineering and Science University, Shibpur, Howrah, India; Indian Statistical Institute, Kolkata, India

  • Venue:
  • Proceedings of the 2006 ACM symposium on Applied computing
  • Year:
  • 2006

Abstract

We propose an algorithm for separating tables and math-zones from document images. The algorithm relies on the spatial characteristics of tables and math-zones in a document. Tables have distinct columns, so the gaps between fields are substantially larger than the gaps between words in ordinary text lines. In math-zones, characters and symbols are less densely packed than in normal text lines. These deceptively simple observations have led us to design a simple but powerful table and math-zone detection system with low computational cost.
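
The two observations can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe. The function names, the bounding-box representation, and the thresholds `factor=3.0` and `ratio=0.6` are all assumptions chosen for illustration.

```python
from statistics import median

def horizontal_gaps(boxes):
    """Gaps between consecutive (x_start, x_end) boxes on one line."""
    ordered = sorted(boxes)
    return [max(0, nxt[0] - cur[1]) for cur, nxt in zip(ordered, ordered[1:])]

def has_column_gaps(boxes, word_gap, factor=3.0):
    """True if any gap is much wider than a typical inter-word gap.

    `factor` is a hypothetical threshold, not a value from the paper.
    """
    return any(g > factor * word_gap for g in horizontal_gaps(boxes))

def ink_density(region):
    """Fraction of foreground (1) pixels in a binary region (rows of 0/1)."""
    total = sum(len(row) for row in region)
    return sum(map(sum, region)) / total if total else 0.0

def is_sparse(region, text_density, ratio=0.6):
    """True if the region is clearly less dense than normal text.

    `ratio` is a hypothetical threshold, not a value from the paper.
    """
    return ink_density(region) < ratio * text_density

# Toy data: word boxes on an ordinary text line and on a table row.
text_line = [(0, 40), (48, 90), (98, 150)]
table_row = [(0, 40), (120, 160), (240, 290)]

# Estimate the typical inter-word gap from the ordinary text line.
word_gap = median(horizontal_gaps(text_line))

text_is_table = has_column_gaps(text_line, word_gap)
row_is_table = has_column_gaps(table_row, word_gap)

# Toy binary regions: a sparse candidate math-zone and a dense text region.
math_like = [[1, 0, 0, 0], [0, 0, 1, 0]]
text_like = [[1, 1, 0, 1], [1, 0, 1, 1]]
math_is_sparse = is_sparse(math_like, text_density=0.5)
text_is_sparse = is_sparse(text_like, text_density=0.5)
```

In this toy data the table row is flagged because its gaps are far wider than the median word gap, and the sparse region is flagged because its foreground density falls well below the assumed text density. A real system would first need to binarize the image and extract text lines and components.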