Two hundred web tables from ten sites were imported into Excel. The tables were edited as needed, then converted into layout-independent Wang Notation using the Table Abstraction Tool (TAT). The output generated by TAT consists of XML files to be used for constructing narrow-domain ontologies. On average, each table required 104 seconds of editing. Augmentations such as aggregates, footnotes, table titles, captions, units, and notes were also extracted, in an average time of 93 seconds. Every user intervention was logged and audited. The logged interactions were analyzed to determine the relative influence of factors such as table size, number of categories, and the various types of augmentations on processing time. The analysis suggests which aspects of interactive table processing can be automated in the near term, and how much time such automation would save. The correlation coefficient between predicted and actual processing time was 0.66.
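The kind of analysis summarized above, predicting processing time from a table property and then correlating predicted with actual times, can be sketched as follows. This is a minimal illustration only: the single predictor (cell count), the helper names `fit_line` and `pearson`, and all numeric values are assumptions invented for the example. They are not the study's data or necessarily its model, which considered several factors.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical values, invented for illustration (not from the study):
# table size in cells and logged editing time in seconds.
cell_counts = [12, 30, 48, 80, 120, 200]
edit_seconds = [40.0, 95.0, 70.0, 150.0, 110.0, 230.0]

# Fit the predictor, compute predicted times, then correlate with actual times.
slope, intercept = fit_line(cell_counts, edit_seconds)
predicted = [slope * x + intercept for x in cell_counts]
r = pearson(predicted, edit_seconds)

print(f"predicted_time = {slope:.3f} * cells + {intercept:.3f}")
print(f"correlation between predicted and actual time: {r:.2f}")
```

With one predictor, the correlation between predicted and actual times equals the absolute correlation between the predictor and the actual times. A model using several factors (table size, number of categories, augmentation types) would need a multivariate fit instead.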