Image Document Categorization Using Hidden Tree Markov Models and Structured Representations

Authors:
Michelangelo Diligenti; Paolo Frasconi; Marco Gori
Venue:
ICAPR '01 Proceedings of the Second International Conference on Advances in Pattern Recognition
Year:
2001

Abstract

Categorization is an important problem in image document processing and is often a preliminary step for subsequent tasks such as recognition, understanding, and information extraction. In this paper, the problem is formulated in the framework of concept learning, where each category corresponds to the set of image documents that share a similar physical structure. We propose a solution based on two algorithmic ideas. First, we transform the image document into a structured representation based on X-Y trees. Compared to "flat" or vector-based feature extraction techniques, structured representations preserve important relationships between image sub-constituents. Second, we introduce a novel probabilistic architecture that extends hidden Markov models to learn probability distributions defined on spaces of labeled trees.