Learning Non-Generative Grammatical Models for Document Analysis

Authors:
Michael Shilman;Percy Liang;Paul Viola
Affiliations:
Microsoft Research;Microsoft Research;Microsoft Research
Venue:
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
Year:
2005

Cited by (7):
Document image analysis for digital libraries. Proceedings of the 2006 international workshop on Research issues in digital libraries
Force deployment analysis with generalized grammar. Information Fusion
Information Extraction. Foundations and Trends in Databases
Model-Guided Segmentation and Layout Labelling of Document Images Using a Hierarchical Conditional Random Field. PReMI '09 Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence
Text versus non-text distinction in online handwritten documents. Proceedings of the 2010 ACM Symposium on Applied Computing
From layout to semantic: a reranking model for mapping web documents to mediated XML representations. Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
Using grammars for pattern recognition in images: A systematic review. ACM Computing Surveys (CSUR)

Abstract

We present a general approach to the hierarchical segmentation and labeling of document layout structures. The approach models document layout as a grammar and performs a global search for the optimal parse under a grammatical cost function. Our contribution is to use machine learning to discriminatively select features and set all parameters of the parsing process. As a result, and unlike many other approaches to layout analysis, ours adapts easily to a variety of document analysis problems: one need only specify the page grammar and provide a set of correctly labeled pages. We apply the technique to two document image analysis tasks, page layout structure extraction and mathematical expression interpretation. Experiments demonstrate that the learned grammars can extract the document structure in 57 files from the UWIII document image database. We also show that the same framework can automatically interpret printed mathematical expressions so as to recreate the original LaTeX.
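
The idea of searching globally for a minimum-cost parse can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical toy grammar over a one-dimensional sequence of text-line heights (real page layout is two-dimensional), and the linear weights in `TERMINAL_WEIGHTS` are set by hand where the paper would learn them from labeled pages. All names (`BINARY`, `TERMINAL_WEIGHTS`, `RULE_COST`, `best_parse`) are illustrative.

```python
import math
from functools import lru_cache

# Hypothetical toy grammar (not from the paper): head -> list of (left, right).
BINARY = {
    "Page": [("Title", "Body")],
    "Body": [("Line", "Body"), ("Line", "Line")],
}
# Terminal symbols score a single primitive with a linear cost w * height + b.
# These hand-set weights stand in for parameters learned from labeled pages.
TERMINAL_WEIGHTS = {"Title": (-1.0, 20.0), "Line": (1.0, -10.0)}
RULE_COST = 1.0  # assumed constant cost for applying any binary production


def best_parse(heights, root="Page"):
    """Return (cost, tree) for the minimum-cost parse of the whole sequence."""
    heights = tuple(heights)

    @lru_cache(maxsize=None)
    def solve(symbol, i, j):
        # Cheapest derivation of heights[i:j] from `symbol`; inf if none exists.
        best = (math.inf, None)
        if j - i == 1 and symbol in TERMINAL_WEIGHTS:
            w, b = TERMINAL_WEIGHTS[symbol]
            best = (w * heights[i] + b, (symbol, i))
        for left, right in BINARY.get(symbol, []):
            for k in range(i + 1, j):  # try every split point of the span
                left_cost, left_tree = solve(left, i, k)
                right_cost, right_tree = solve(right, k, j)
                cost = left_cost + right_cost + RULE_COST
                if cost < best[0]:
                    best = (cost, (symbol, left_tree, right_tree))
        return best

    return solve(root, 0, len(heights))


cost, tree = best_parse([24, 10, 10, 10])
print(cost, tree)
```

Because every sub-span is scored once and memoized, the search considers all parses licensed by the grammar rather than committing to local decisions early. Swapping in a different grammar or different weights changes the task without changing the search, which is the adaptability the abstract describes.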