Symbolic Learning Techniques in Paper Document Processing

  • Authors:
  • Oronzo Altamura; Floriana Esposito; Francesca A. Lisi; Donato Malerba

  • Venue:
  • MLDM '99 Proceedings of the First International Workshop on Machine Learning and Data Mining in Pattern Recognition
  • Year:
  • 1999

Abstract

WISDOM++ is an intelligent document processing system that transforms a paper document into HTML/XML format. Its main design requirement is adaptivity, which is achieved by applying machine learning methods. This paper illustrates the application of symbolic learning algorithms to the first three steps of document processing: document analysis, document classification, and document understanding. The machine learning issues raised by this application are the efficient incremental induction of decision trees from numeric data, the handling of both numeric and symbolic data in first-order rule learning, and the learning of mutually dependent concepts. Experimental results obtained on a set of real-world documents are presented and discussed.
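
To make the phrase "induction of decision trees from numeric data" more concrete, the following is a minimal, generic sketch of how a single decision-tree node might choose a threshold on one numeric feature by information gain. It is not the algorithm used in WISDOM++: it is a plain batch computation rather than an incremental one, and the feature (block height), the class labels, and the function names `entropy` and `best_threshold` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary split `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values; the one with the highest information gain wins.
    Returns (None, 0.0) if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (
            len(left) * entropy(left) + len(right) * entropy(right)
        ) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = ((low + high) / 2, gain)
    return best


# Hypothetical data: heights of four layout blocks and their classes.
heights = [10, 12, 30, 34]
classes = ["text", "text", "image", "image"]
threshold, gain = best_threshold(heights, classes)
```

With this toy data the two classes are perfectly separable on height, so the chosen threshold falls midway between 12 and 30. An incremental learner of the kind the abstract refers to would have to update such a choice as new examples arrive instead of recomputing it from scratch; that part is not shown here.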