To realize a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and this requirement can be met by using the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the underlying structure, i.e., the DTD, from a collection of similarly structured XML documents that share some common but unknown DTDs. The essential data structures and algorithms for DTD generation have been developed, and experiments on real Web collections have been conducted to demonstrate their feasibility. In addition, we propose a method of imposing a constraint on the repetitiveness of the elements in a DTD rule to further simplify the generated DTD without compromising its correctness.
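
To make the idea of deriving a DTD from example documents concrete, the following is a minimal sketch in Python. It is not the framework or the algorithms of the paper, which the abstract does not detail. The function names `collapse_repeats` and `infer_dtd` are hypothetical, and the sketch assumes that attributes and mixed content can be ignored, that a run of identical sibling elements may be written as `name+`, and that alternative child sequences may simply be joined with `|` (the resulting content models are not guaranteed to be deterministic in the sense the XML specification asks for).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import groupby


def collapse_repeats(tags):
    """Collapse each run of identical sibling tags into 'tag+' (assumed rule)."""
    return tuple(
        tag + "+" if len(list(run)) > 1 else tag
        for tag, run in groupby(tags)
    )


def infer_dtd(documents):
    """Return naive <!ELEMENT ...> rules derived from a list of XML strings."""
    patterns = defaultdict(set)   # element name -> observed child sequences
    has_text = defaultdict(bool)  # element name -> saw non-blank leading text
    for doc in documents:
        for elem in ET.fromstring(doc).iter():
            patterns[elem.tag].add(collapse_repeats([c.tag for c in elem]))
            if elem.text and elem.text.strip():
                has_text[elem.tag] = True

    rules = []
    for name in sorted(patterns):
        seqs = sorted(s for s in patterns[name] if s)
        if not seqs:
            # Never seen with children: text-only or empty element.
            model = "(#PCDATA)" if has_text[name] else "EMPTY"
        else:
            # One alternative per distinct child sequence observed.
            model = "(" + " | ".join("(" + ", ".join(s) + ")" for s in seqs) + ")"
            if () in patterns[name]:
                model += "?"  # element also occurred without children
        rules.append(f"<!ELEMENT {name} {model}>")
    return rules


if __name__ == "__main__":
    # Two made-up documents that share an unknown structure.
    docs = [
        "<lib><book><title>A</title><author>x</author><author>y</author></book></lib>",
        "<lib><book><title>B</title><author>z</author></book>"
        "<book><title>C</title><author>w</author></book></lib>",
    ]
    for rule in infer_dtd(docs):
        print(rule)
```

Collapsing repeated siblings into `name+` keeps the number of distinct child sequences small, which is one simple way to bound repetition in a rule. The sketch still emits redundant alternatives such as `(book) | (book+)`; a simplification step of the kind the abstract mentions would be needed to merge them.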