This paper presents a unified algorithm for segmenting and identifying various tabular structures in document page images. These structures include conventional tables and displayed math-zones, as well as Table of Contents (TOC) and Index pages. After analyzing the page composition, the algorithm first classifies the input document pages as tabular or non-tabular: a tabular page contains at least one tabular structure, whereas a non-tabular page contains none. The approach is unified in the sense that it identifies all tabular structures on a tabular page, which considerably simplifies document image segmentation in a novel way. The unification also speeds up segmentation, because existing methods treat the different tabular structures as separate physical entities and handle each with its own time-consuming procedure. The distinguishing features of the different kinds of tabular structures are applied in stages to keep the algorithm simple and efficient, and both properties are demonstrated by exhaustive experimental results.
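The control flow described above (a page-level split into tabular and non-tabular pages, followed by staged identification of each structure) can be illustrated with a minimal sketch. The abstract does not specify the distinguishing features or the page composition analysis, so everything here is an assumption for illustration: the names `Kind`, `Block`, `Page`, `identify`, `label_page`, and `split_pages` are hypothetical, the per-stage tests are supplied by the caller rather than taken from the paper, and the page split simply applies the stated definition (a tabular page contains at least one tabular structure).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class Kind(Enum):
    """The tabular structures named in the abstract."""
    TABLE = "table"
    MATH_ZONE = "math-zone"
    TOC = "toc"
    INDEX = "index"


@dataclass
class Block:
    """A candidate page region. `features` is a hypothetical feature map;
    the paper's actual features are not given in the abstract."""
    features: Dict[str, float] = field(default_factory=dict)


@dataclass
class Page:
    blocks: List[Block] = field(default_factory=list)


# A stage pairs a structure kind with a caller-supplied test on a block.
Stage = Tuple[Kind, Callable[[Block], bool]]


def identify(block: Block, stages: List[Stage]) -> Optional[Kind]:
    """Try the stages in order and return the first kind whose test passes.
    Later stages only see blocks that every earlier stage rejected."""
    for kind, test in stages:
        if test(block):
            return kind
    return None


def label_page(page: Page, stages: List[Stage]) -> List[Kind]:
    """Return the kinds of all tabular structures found on the page."""
    labels = []
    for block in page.blocks:
        kind = identify(block, stages)
        if kind is not None:
            labels.append(kind)
    return labels


def split_pages(pages: List[Page], stages: List[Stage]) -> Tuple[List[Page], List[Page]]:
    """Partition pages into (tabular, non_tabular) using the definition
    that a tabular page contains at least one tabular structure."""
    tabular, non_tabular = [], []
    for page in pages:
        if label_page(page, stages):
            tabular.append(page)
        else:
            non_tabular.append(page)
    return tabular, non_tabular
```

A caller would pass an ordered list of stages, for example `[(Kind.TOC, toc_test), (Kind.TABLE, table_test)]`, where each test is a placeholder predicate over a block's features. Ordering the stages is the design choice that mirrors the "in stages" idea: cheap or highly discriminative tests can run first so that later tests handle fewer blocks.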