Grammars are a powerful technique for modeling and extracting the structure of documents. A major challenge, however, is computational complexity. The cost of grammatical parsing depends on both the complexity of the input and the ambiguity of the grammar. For programming languages, where the terminals appear in a linear sequence and the grammar is unambiguous, parsing is O(N). For natural languages, where the input is still linear but the grammar is ambiguous, parsing is O(N^3). For documents, where the terminals are arranged in two dimensions and the grammar is ambiguous, parsing time can be exponential in the number of terminals, because any subset of the terminals is a candidate constituent. In this paper we introduce (and unify) several types of geometrical data structures that can be used to significantly reduce parsing time. Each data structure embodies a different geometrical constraint on the set of possible valid parses. These data structures are very general: they can be used with any type of grammatical model, and in a wide variety of document understanding tasks, to limit the set of hypotheses examined and tested. Given a clean design for the parsing software, the same parsing framework can be tested with various geometric constraints to determine the most effective combination.
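To make the idea of a geometric constraint concrete, the following is a minimal sketch and not the data structures introduced in the paper. It assumes each terminal is an axis-aligned bounding box and uses one hypothetical constraint, called `rectangle_closed` here: a group of terminals is an admissible hypothesis only if no other terminal has its center inside the group's bounding rectangle. The names `Box`, `hypotheses`, `unconstrained`, `rectangle_closed`, and `grid` are all assumptions made for illustration. The constraint is passed in as a plain function, so the same enumeration loop can be run with different constraints.

```python
from itertools import combinations
from typing import Callable, Iterator, List, NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned bounding box of one terminal (hypothetical representation)."""
    x0: float
    y0: float
    x1: float
    y1: float


Subset = Tuple[int, ...]
Constraint = Callable[[Subset, Sequence[Box]], bool]


def bounding_hull(boxes: Sequence[Box]) -> Box:
    """Smallest axis-aligned rectangle that contains every box."""
    return Box(min(b.x0 for b in boxes), min(b.y0 for b in boxes),
               max(b.x1 for b in boxes), max(b.y1 for b in boxes))


def center_in(box: Box, region: Box) -> bool:
    """True if the center point of `box` lies inside `region`."""
    cx = (box.x0 + box.x1) / 2
    cy = (box.y0 + box.y1) / 2
    return region.x0 <= cx <= region.x1 and region.y0 <= cy <= region.y1


def unconstrained(subset: Subset, terminals: Sequence[Box]) -> bool:
    """No geometric constraint: every non-empty subset is a hypothesis."""
    return True


def rectangle_closed(subset: Subset, terminals: Sequence[Box]) -> bool:
    """Assumed constraint: no terminal outside the subset has its center
    inside the subset's bounding rectangle."""
    hull = bounding_hull([terminals[i] for i in subset])
    members = set(subset)
    return all(i in members or not center_in(t, hull)
               for i, t in enumerate(terminals))


def hypotheses(terminals: Sequence[Box], constraint: Constraint) -> Iterator[Subset]:
    """Yield every non-empty subset of terminal indices the constraint admits.
    This filters a brute-force enumeration, so it only illustrates how much
    the hypothesis set shrinks; it does not make the enumeration itself fast."""
    n = len(terminals)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if constraint(subset, terminals):
                yield subset


def grid(rows: int, cols: int) -> List[Box]:
    """Toy page: terminals laid out on a regular rows x cols grid."""
    return [Box(c, r, c + 0.8, r + 0.8) for r in range(rows) for c in range(cols)]


page = grid(2, 3)
n_all = sum(1 for _ in hypotheses(page, unconstrained))
n_closed = sum(1 for _ in hypotheses(page, rectangle_closed))
print(n_all, n_closed)  # prints "63 18"
```

On this toy page of six terminals, the unconstrained parser would have to consider all 2^6 - 1 = 63 non-empty subsets, while the rectangle constraint admits only the 18 sub-rectangles of the grid. A practical implementation would generate the admissible subsets directly rather than filtering the exponential set, which is the role a dedicated geometric data structure would play.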