Understanding Digital Documents Using Gestalt Properties of Isothetic Components
International Journal of Digital Library Systems
This paper presents a method of document image segmentation for pages whose document components have arbitrary shapes and arbitrary skew angles. The proposed method has two main characteristics: (1) A Voronoi diagram is constructed from the connected components to obtain candidate boundaries of document components. (2) The candidates are used to estimate the inter-character, inter-line, and inter-column gaps without domain-specific parameters, and the boundaries are selected on the basis of these estimates. Experimental results on 80 images with non-Manhattan layouts and skews of 0° to 45° confirm that the method is effective for extracting column regions and is as efficient as other methods based on connected component analysis.
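The two steps in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`label_components`, `discrete_voronoi`, `adjacent_gaps`, `column_boundaries`) are hypothetical, the Voronoi diagram is approximated by brute-force nearest-pixel labelling on the image grid, and the "largest jump" rule used to separate small gaps from large ones is an assumed stand-in for the paper's gap estimation, which the abstract does not specify.

```python
import math
from collections import deque
from itertools import product


def label_components(img):
    """Label 8-connected foreground (1) pixels; returns (label grid, count)."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y, x in product(range(h), range(w)):
        if not img[y][x] or labels[y][x]:
            continue
        count += 1
        labels[y][x] = count
        queue = deque([(y, x)])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in product((-1, 0, 1), repeat=2):
                ny, nx = cy + dy, cx + dx
                if (0 <= ny < h and 0 <= nx < w
                        and img[ny][nx] and not labels[ny][nx]):
                    labels[ny][nx] = count
                    queue.append((ny, nx))
    return labels, count


def discrete_voronoi(labels):
    """Give every pixel the label of its nearest foreground pixel (brute force).

    Assumes the image contains at least one foreground pixel.
    """
    h, w = len(labels), len(labels[0])
    seeds = [(y, x, labels[y][x])
             for y, x in product(range(h), range(w)) if labels[y][x]]
    return [[min(seeds, key=lambda s: (s[0] - y) ** 2 + (s[1] - x) ** 2)[2]
             for x in range(w)] for y in range(h)]


def adjacent_gaps(labels, regions):
    """Map each pair of components with touching Voronoi cells to their gap."""
    h, w = len(labels), len(labels[0])
    pixels = {}
    pairs = set()
    for y, x in product(range(h), range(w)):
        if labels[y][x]:
            pixels.setdefault(labels[y][x], []).append((y, x))
        # A cell border between two regions is a candidate boundary.
        for ny, nx in ((y + 1, x), (y, x + 1)):
            if ny < h and nx < w and regions[y][x] != regions[ny][nx]:
                pairs.add(tuple(sorted((regions[y][x], regions[ny][nx]))))
    return {(a, b): min(math.dist(p, q) for p in pixels[a] for q in pixels[b])
            for a, b in pairs}


def column_boundaries(gaps):
    """Keep candidates whose gap lies above the largest jump in sorted gaps.

    Assumed heuristic, not the paper's estimator; needs at least two gaps.
    """
    ordered = sorted(gaps.values())
    jumps = [(b - a, (a + b) / 2) for a, b in zip(ordered, ordered[1:])]
    threshold = max(jumps)[1]
    return sorted(pair for pair, gap in gaps.items() if gap > threshold)


# Toy page: three narrow strokes, a wide gap, then two more strokes.
page = [[1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1]] * 3
labels, count = label_components(page)
regions = discrete_voronoi(labels)
print(column_boundaries(adjacent_gaps(labels, regions)))
```

The brute-force labelling is quadratic in the number of pixels and is only meant for toy inputs. Because the cells are built from component pixels rather than from axis-aligned projections, the same pipeline applies unchanged to skewed or non-rectangular layouts, which is the property the abstract emphasizes.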