This paper investigates methods for automatically inferring structural information from large XML documents. Using XML as a reference format, we approach the schema generation problem by applying inductive inference theory. In doing so, we review and extend results on the search spaces of grammatical inference for large data sets. We evaluate the result of an inference process using the concept of Minimum Message Length. Comprehensive experiments show that our new hybrid method is the most effective for large documents. Finally, tractability issues, including a scalability analysis, are discussed.
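To make the setting concrete, the following is a minimal sketch, not the paper's method. It assumes that each element's content model is learned from the sequences of child tags observed under that element, that inference starts from a prefix tree acceptor (the usual starting point for state-merging approaches), and that candidate automata are compared by a crude two-part code length. The names `child_sequences`, `build_pta`, `two_part_length`, the `END` marker, and the cost formula itself are illustrative assumptions; the paper's actual Minimum Message Length formulation is not given in the abstract.

```python
import math
import xml.etree.ElementTree as ET
from collections import defaultdict

END = "$"  # hypothetical end-of-sequence marker, not taken from the paper


def child_sequences(xml_text):
    """Map each element name to the list of child-tag sequences seen under it."""
    samples = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        samples[elem.tag].append([child.tag for child in elem])
    return samples


def build_pta(sequences):
    """Build a prefix tree acceptor with usage counts.

    counts[s][symbol] is how often `symbol` (or END) was emitted in state s;
    next_state[s][symbol] is the state reached from s on `symbol`.
    """
    counts = [defaultdict(int)]
    next_state = [{}]
    for seq in sequences:
        state = 0
        for symbol in seq:
            counts[state][symbol] += 1
            if symbol not in next_state[state]:
                next_state[state][symbol] = len(counts)
                counts.append(defaultdict(int))
                next_state.append({})
            state = next_state[state][symbol]
        counts[state][END] += 1
    return counts, next_state


def two_part_length(counts, next_state, alphabet_size):
    """Return (model_bits, data_bits) for a simplified two-part code.

    Assumed cost model: each transition is stated as (source, symbol, target),
    and the data is coded with the empirical emission probabilities per state.
    """
    n_states = len(counts)
    n_transitions = sum(len(t) for t in next_state)
    model_bits = n_transitions * (2 * math.log2(n_states) + math.log2(alphabet_size))
    data_bits = 0.0
    for state_counts in counts:
        total = sum(state_counts.values())
        for c in state_counts.values():
            data_bits -= c * math.log2(c / total)
    return model_bits, data_bits


DOC = (
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/><author/><author/></book>"
    "</lib>"
)
book_samples = child_sequences(DOC)["book"]
pta_counts, pta_next = build_pta(book_samples)
model_bits, data_bits = two_part_length(pta_counts, pta_next, alphabet_size=2)
```

Under this simplified cost, merging states of the acceptor lowers the model part and typically raises the data part; a search over merges would keep a merge only if the total decreases.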