The extraction of the relations of nested table headers to content cells is automated with a view to constructing narrow domain ontologies of semi-structured web data. A taxonomy of tessellations for displaying tabular data is developed. X-Y tessellations that can be obtained by a divide-and-conquer method are asymptotically only an infinitesimal fraction of all partitions of a rectangle into rectangles. Admissible tessellations are the even smaller subset of all partitions that correspond to the structures of published tables and that contain only rectangles produced by successive guillotine cuts. Many of these can be processed automatically. Their structures can be conveniently represented by X-Y trees, which facilitate relating hierarchical row and column headings to content cells. A formal grammar is proposed for characterizing the X-Y trees of layout-equivalent admissible tessellations. Algorithms are presented for transforming a tessellation into an X-Y tree and hence into multidimensional, layout-independent Category Trees (Wang abstract data types).
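The divide-and-conquer idea behind X-Y trees can be illustrated with a minimal Python sketch. It is not the algorithm presented in the paper, only one plausible reading of "successive guillotine cuts". The cell encoding (integer `(x0, y0, x1, y1)` tuples with y increasing downward), the node labels `"V"` and `"H"`, and the function names `guillotine_cuts` and `build_xy_tree` are all assumptions made for the example. The sketch also assumes the cells tile their bounding rectangle exactly, with no gaps or overlaps, and does not check this.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: a cell is (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
Rect = Tuple[int, int, int, int]
# A tree is either a leaf cell or (label, children); "V" marks vertical cuts
# (children ordered left to right), "H" marks horizontal cuts (top to bottom).
Tree = Union[Rect, Tuple[str, list]]


def guillotine_cuts(cells: List[Rect], axis: int) -> List[int]:
    """Return the coordinates along `axis` (0 = x, 1 = y) at which a full
    edge-to-edge cut passes through the interior of no cell."""
    lo = min(c[axis] for c in cells)
    hi = max(c[axis + 2] for c in cells)
    candidates = {c[axis] for c in cells} | {c[axis + 2] for c in cells}
    return sorted(
        p for p in candidates
        if lo < p < hi
        and all(c[axis + 2] <= p or c[axis] >= p for c in cells)
    )


def build_xy_tree(cells: List[Rect]) -> Tree:
    """Recursively split a tessellation into an X-Y tree.

    Raises ValueError when no guillotine cut exists, i.e. the layout
    is not an X-Y tessellation (for example, a pinwheel arrangement).
    """
    if len(cells) == 1:
        return cells[0]
    for axis, label in ((0, "V"), (1, "H")):
        cuts = guillotine_cuts(cells, axis)
        if not cuts:
            continue
        lo = min(c[axis] for c in cells)
        hi = max(c[axis + 2] for c in cells)
        edges = [lo] + cuts + [hi]
        children = []
        for a, b in zip(edges, edges[1:]):
            # Cells that lie entirely inside the strip between two cuts.
            strip = [c for c in cells if a <= c[axis] and c[axis + 2] <= b]
            children.append(build_xy_tree(strip))
        return (label, children)
    raise ValueError("not an X-Y tessellation: no guillotine cut exists")


# A header cell spanning two columns, above two content cells.
table = [(0, 0, 2, 1), (0, 1, 1, 2), (1, 1, 2, 2)]
tree = build_xy_tree(table)
```

For this small table the only available first cut is horizontal, below the spanning header, and the lower strip is then split vertically into the two content cells. Because every available cut in one direction is taken at once, the cut direction alternates from level to level of the tree.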