In this paper, we present a technique for efficiently extracting concise and accurate schemas from XML documents. By restricting the form of the schema and applying heuristic rules, we achieve both efficiency and conciseness. An experiment with real-life DTDs shows that our approach attains high accuracy and is 20 to 200 times faster than existing approaches.
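To make the idea of a restricted schema form concrete, here is a minimal sketch in Python. It is not the technique of the paper, whose restrictions and heuristic rules are not spelled out in this abstract. The sketch assumes a deliberately simple form: each element's content model is a flat sequence of child names, each carrying one of the DTD multiplicities (none, `?`, `+`, `*`). The function names `infer_content_models` and `to_dtd`, the first-seen ordering of children, and the treatment of childless elements as `#PCDATA` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def infer_content_models(documents):
    """Infer, for each element name, a flat list of children with multiplicities.

    Illustrative only: children are listed in first-seen order, which is a
    simplification and may not match the order in every input document.
    """
    occurrences = defaultdict(int)  # how often each element name appears
    child_counts = defaultdict(lambda: defaultdict(list))  # parent -> child -> counts
    child_order = defaultdict(list)  # parent -> child names in first-seen order

    for text in documents:
        root = ET.fromstring(text)
        for elem in root.iter():
            occurrences[elem.tag] += 1
            for child in elem:
                if child.tag not in child_order[elem.tag]:
                    child_order[elem.tag].append(child.tag)
            # One count per child name for this particular parent instance.
            for tag, n in Counter(child.tag for child in elem).items():
                child_counts[elem.tag][tag].append(n)

    suffixes = {(False, False): "", (True, False): "?",
                (False, True): "+", (True, True): "*"}
    models = {}
    for parent, total in occurrences.items():
        parts = []
        for child in child_order[parent]:
            seen = child_counts[parent][child]
            optional = len(seen) < total  # absent from some parent instances
            repeated = max(seen) > 1      # occurs more than once somewhere
            parts.append(child + suffixes[(optional, repeated)])
        models[parent] = parts
    return models


def to_dtd(models):
    """Render the inferred models as DTD element declarations."""
    lines = []
    for name, parts in models.items():
        # Assumption: an element never seen with children holds text only.
        content = "(" + ", ".join(parts) + ")" if parts else "(#PCDATA)"
        lines.append(f"<!ELEMENT {name} {content}>")
    return "\n".join(lines)


docs = [
    "<book><title>A</title><author>x</author><author>y</author></book>",
    "<book><title>B</title><author>z</author><year>1</year></book>",
]
print(to_dtd(infer_content_models(docs)))
# First line printed: <!ELEMENT book (title, author+, year?)>
```

Restricting the schema to this flat form keeps inference to a single pass over the documents and keeps the output short, at the cost of expressiveness: it cannot represent alternation or nested groups.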