In this paper we describe a top-down approach to the segmentation and representation of documents containing tabular structures. Examples of such documents are invoices and technical papers with tables. The segmentation is based on an extension of X-Y trees in which regions are split by cuts along separators (e.g., lines), in addition to cuts along white space. The leaves of the tree describe either regions containing homogeneous information or cutting separators. Adjacency links among the leaves describe local relationships between the corresponding regions.
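
To make the idea of cutting along both white space and separators concrete, here is a minimal sketch of a recursive X-Y style segmentation in Python. It is not the paper's implementation. The page encoding (a grid of cells labeled WHITE, INK, or LINE), the Node class, and the functions segment and leaves are all assumptions introduced for illustration, and a row or column counts as a separator only if every cell in it is a LINE cell within the current region. Adjacency links between leaves are left out of the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical cell labels: empty background, text/content, ruling line.
WHITE, INK, LINE = 0, 1, 2


@dataclass
class Node:
    box: tuple   # (top, left, bottom, right); bottom and right are exclusive
    kind: str    # "hcut", "vcut", "region", "separator", or "empty"
    children: list = field(default_factory=list)


def _profile(page, box, horizontal):
    """Label every row (or column) of the box as white, line, or content."""
    top, left, bottom, right = box
    labels = []
    for i in (range(top, bottom) if horizontal else range(left, right)):
        if horizontal:
            cells = page[i][left:right]
        else:
            cells = [page[r][i] for r in range(top, bottom)]
        if all(c == WHITE for c in cells):
            labels.append("white")
        elif all(c == LINE for c in cells):
            labels.append("line")
        else:
            labels.append("content")
    return labels


def _runs(labels):
    """Group consecutive equal labels into (label, start, end) runs."""
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], start, i))
            start = i
    return runs


def segment(page, box=None, horizontal=True, tried_other=False):
    """Split the box top-down, alternating horizontal and vertical cuts."""
    if box is None:
        box = (0, 0, len(page), len(page[0]))
    top, left, bottom, right = box
    runs = _runs(_profile(page, box, horizontal))
    if len(runs) == 1:
        label = runs[0][0]
        if label == "line":
            return Node(box, "separator")
        if label == "white":
            return Node(box, "empty")
        if tried_other:
            # No cut is possible in either direction: homogeneous leaf.
            return Node(box, "region")
        return segment(page, box, not horizontal, tried_other=True)
    node = Node(box, "hcut" if horizontal else "vcut")
    for label, start, end in runs:
        if label == "white":
            continue  # a white-space cut leaves no leaf behind
        if horizontal:
            child = (top + start, left, top + end, right)
        else:
            child = (top, left + start, bottom, left + end)
        if label == "line":
            # A cut along a separator keeps the separator as a leaf.
            node.children.append(Node(child, "separator"))
        else:
            node.children.append(segment(page, child, not horizontal))
    return node


def leaves(node):
    """Return the leaves of the tree in left-to-right order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


# Toy page: two text blocks split by a white column, a full-width
# horizontal line, then two text blocks split by a vertical line.
page = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [2, 2, 2, 2, 2],
    [1, 1, 2, 1, 1],
    [1, 1, 2, 1, 1],
]
tree = segment(page)
for leaf in leaves(tree):
    print(leaf.kind, leaf.box)
```

On this toy page the tree has six leaves: four content regions and two separators. The white column between the upper blocks produces a cut but no leaf, whereas both ruling lines survive as separator leaves, which is the difference between the two kinds of cut described above.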