In this paper we address the problem of evolving a set of DTDs so as to obtain a description, as precise as possible, of the structure of the documents actually stored in a source of XML documents. This problem is highly relevant in an environment as dynamic and heterogeneous as the Web. The approach we propose relies on a classification mechanism based on document structure and on data mining association rules to discover frequent structural patterns in the data.
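
The abstract does not spell out the mining procedure, so the following is only a minimal sketch of how association rules over structural patterns might look, not the authors' algorithm. It assumes that each occurrence of a given element is treated as a "transaction" whose items are the tags of its direct children, and that only pairwise rules are of interest. The function names `child_tag_sets` and `pairwise_rules`, the thresholds, and the sample `book` documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import xml.etree.ElementTree as ET


def child_tag_sets(xml_docs, parent_tag):
    """Collect, for every <parent_tag> element, the set of its child tags."""
    transactions = []
    for text in xml_docs:
        root = ET.fromstring(text)
        # iter() also visits the root itself when its tag matches.
        for elem in root.iter(parent_tag):
            transactions.append(frozenset(child.tag for child in elem))
    return transactions


def pairwise_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for rules 'lhs present => rhs present'."""
    n = len(transactions)
    single = Counter()
    pair = Counter()
    for t in transactions:
        single.update(t)
        pair.update(combinations(sorted(t), 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        # Try both directions of the rule for each frequent pair.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical sample documents, used for illustration only.
docs = [
    "<book><title/><author/><year/></book>",
    "<book><title/><author/></book>",
    "<book><title/><year/></book>",
    "<book><title/><author/><price/></book>",
]
rules = pairwise_rules(child_tag_sets(docs, "book"))
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} => {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

Under these assumptions, a rule such as "whenever `author` appears under `book`, `title` appears too" is the kind of frequent structural pattern that could suggest how a DTD declaration for `book` should be tightened or relaxed. How such rules are actually turned into DTD changes is not described in the abstract and is left open here.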