The Journal of Machine Learning Research
Substructure discovery using minimum description length and background knowledge
Journal of Artificial Intelligence Research
In this paper we present DoLSuD (Document Layout Substructure Discovery), a system for the automatic discovery of relevant substructures in a document layout. DoLSuD extracts, analyzes, and describes the visual content of structured documents, such as catalogs, in order to discover repeating and distinctive substructures in the layout and to establish relations between textual and image content. We describe the system, report experimental results, and present the web-based service that uses the analysis results.