An Optimization Methodology for Document Structure Extraction on Latin Character Documents

Authors:
Jisheng Liang;Ihsin T. Phillips;Robert M. Haralick
Affiliations:
Insightful Corp., Seattle, WA;Queens College, City Univ. of New York, Flushing, NY;Graduate Center, City Univ. of New York, New York, NY
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2001

Abstract

In this paper, we give a formal definition of a document image structure representation and formulate document image structure extraction as a partitioning problem: finding an optimal partition of the set of glyphs of an input document image into a hierarchical tree structure in which the entities at each level of the hierarchy have similar physical properties and compatible semantic labels. We present a unified methodology that is applicable to the construction of document structures at different hierarchical levels. An iterative, relaxation-like method is used to find a partitioning solution that maximizes the probability of the extracted structure. All the probabilities used in the partitioning process are estimated from an extensive training set of various kinds of measurements among the entities within the hierarchy. The probabilities estimated offline during training then drive all decisions in the online document structure extraction. We have implemented a text line extraction algorithm using this framework. The algorithm was evaluated on the UW-III database of some 1,600 scanned document image pages. An area-overlap measure is used to find the correspondence between the detected entities and the ground truth. Of a total of 105,020 text lines, the text line extraction algorithm identifies and segments 104,773 correctly, an accuracy of 99.76 percent. The details of the algorithm are presented in this paper.
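
The abstract does not spell out the area-overlap measure, so the following is only a minimal sketch of how such an evaluation could be set up, not the authors' procedure. It assumes axis-aligned bounding boxes, a one-to-one matching rule, and a hypothetical overlap threshold of 0.9; the names `Box`, `overlap_area`, `count_correct`, and `accuracy_percent` are illustrative and do not come from the paper. The last function reproduces the reported accuracy from the two counts given in the abstract.

```python
from typing import List, Tuple

# Assumed representation: (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
Box = Tuple[float, float, float, float]


def area(b: Box) -> float:
    """Area of an axis-aligned box (zero if degenerate)."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two boxes (zero if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def count_correct(truth: List[Box], detected: List[Box],
                  threshold: float = 0.9) -> int:
    """Count ground-truth boxes that have a one-to-one match.

    A pair matches when the intersection covers at least `threshold`
    of BOTH boxes. The threshold value is an assumption of this sketch.
    """
    used = set()  # indices of detections already matched
    correct = 0
    for g in truth:
        for j, d in enumerate(detected):
            if j in used or area(g) == 0 or area(d) == 0:
                continue
            inter = overlap_area(g, d)
            if inter / area(g) >= threshold and inter / area(d) >= threshold:
                used.add(j)
                correct += 1
                break
    return correct


def accuracy_percent(correct: int, total: int) -> float:
    """Percentage of correctly extracted entities, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


# Toy example: the first line is detected exactly, the second is cut in half.
truth_lines = [(0, 0, 100, 10), (0, 20, 100, 30)]
detected_lines = [(0, 0, 100, 10), (0, 20, 50, 30)]
n_correct = count_correct(truth_lines, detected_lines)

# Counts reported in the abstract for the UW-III evaluation.
reported = accuracy_percent(104_773, 105_020)
```

In the toy example only the first line counts as correct, because the second detection covers just half of its ground-truth line. Requiring the overlap to be large relative to both boxes is one way to penalize split and merged lines as well as missed ones.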