Symbolic Learning Techniques in Paper Document Processing

Authors:
Oronzo Altamura; Floriana Esposito; Francesca A. Lisi; Donato Malerba
Venue:
MLDM '99 Proceedings of the First International Workshop on Machine Learning and Data Mining in Pattern Recognition
Year:
1999

Abstract

WISDOM++ is an intelligent document processing system that transforms paper documents into HTML/XML format. Its main design requirement is adaptivity, which is achieved by applying machine learning methods. This paper illustrates the application of symbolic learning algorithms to the first three steps of document processing: document analysis, document classification, and document understanding. The machine learning issues raised by this application are the efficient incremental induction of decision trees from numeric data, the handling of both numeric and symbolic data in first-order rule learning, and the learning of mutually dependent concepts. Experimental results obtained on a set of real-world documents are presented and discussed.
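
To make the phrase "induction of decision trees from numeric data" concrete, the following minimal sketch shows one common building block of that task: choosing a threshold on a numeric attribute by information gain. It is not the algorithm used in WISDOM++, and it works in batch mode rather than incrementally. The function names (`entropy`, `best_threshold`), the block heights, and the class labels are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary test `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) if no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no cut point exists between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Weighted average entropy of the two partitions.
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = ((low + high) / 2, gain)
    return best


# Hypothetical layout-block heights (pixels) and made-up block classes.
heights = [8, 10, 12, 60, 75, 90]
classes = ["text", "text", "text", "image", "image", "image"]
threshold, gain = best_threshold(heights, classes)
print(threshold, gain)
```

An incremental learner would update such split statistics as new examples arrive instead of recomputing them over the full training set, which is the efficiency concern the abstract refers to.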