The widespread adoption of document digitization has led to the use of structured digital representations such as XML. Methods that extract formatting metadata from such structured documents can improve their accessibility, for example through augmented audio representations of documents. To the best of our knowledge, no system yet extracts the semantics of document formatting automatically from layout alone, without natural language processing. In this study, a corpus of XML representations of several issues of a Greek newspaper is used to build and evaluate a semantic classifier of text formatting based on Bayesian networks.