Measuring the structural similarity between an XML document and a DTD has many applications, ranging from document classification and approximate structural queries on XML documents to selective dissemination of XML documents and document protection. The problem is harder than measuring structural similarity among documents because a DTD can be regarded as a generator of documents; the task is therefore to evaluate the similarity between a document and a set of documents. An effective structural similarity measure must meet several requirements: it should account for the presence and absence of required elements, for the structure and level of missing and extra elements, and for vocabulary discrepancies due to synonymous or syntactically similar tags. Starting from these requirements, this paper defines the measure and presents an algorithm that matches a document against a DTD to obtain their structural similarity. Finally, experimental results assessing the effectiveness of the approach are presented.
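To make the requirements above more concrete, the following is a minimal sketch of one way such a measure could be computed. It is not the algorithm presented in the paper. It assumes a heavily simplified DTD (a hypothetical dictionary mapping each tag to its allowed children, each flagged as required or optional, with no repetition or choice operators), exact tag equality instead of synonym handling, and a hypothetical weight that halves at each level so that deviations deeper in the tree count less. The names `Node`, `weight`, `evaluate`, and `similarity` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A document element: a tag plus its child elements."""
    tag: str
    children: list = field(default_factory=list)


def weight(level, decay=2.0):
    # Assumed weighting: elements at deeper levels contribute less.
    return 1.0 / (decay ** level)


def subtree_weight(node, level):
    # Total weight of a subtree rooted at `node` placed at `level`.
    return weight(level) + sum(subtree_weight(c, level + 1) for c in node.children)


def evaluate(node, dtd, level=0):
    """Return (common, plus, minus) weights for a node whose tag is declared.

    common: elements present in the document and allowed by the DTD
    plus:   extra elements in the document that the DTD does not allow
    minus:  required elements of the DTD missing from the document
    """
    common, plus, minus = weight(level), 0.0, 0.0
    declared = dtd.get(node.tag, {})  # child tag -> is_required
    seen = set()
    for child in node.children:
        if child.tag in declared:
            seen.add(child.tag)
            c, p, m = evaluate(child, dtd, level + 1)
            common, plus, minus = common + c, plus + p, minus + m
        else:
            # The whole undeclared subtree counts as extra.
            plus += subtree_weight(child, level + 1)
    for tag, required in declared.items():
        if required and tag not in seen:
            # Simplification: only the missing element itself is counted.
            minus += weight(level + 1)
    return common, plus, minus


def similarity(doc, dtd, root_tag):
    """Score in [0, 1]; 1.0 means no missing or extra elements were found."""
    if doc.tag != root_tag:
        return 0.0
    common, plus, minus = evaluate(doc, dtd)
    return common / (common + plus + minus)


# Hypothetical example data.
dtd = {"book": {"title": True, "author": True, "note": False}}
valid = Node("book", [Node("title"), Node("author")])
flawed = Node("book", [Node("title"), Node("price")])  # missing author, extra price

print(similarity(valid, dtd, "book"))
print(similarity(flawed, dtd, "book"))
```

Under these assumptions, a document that satisfies the simplified DTD scores 1.0, and the score drops as required elements go missing or undeclared elements appear, with deviations near the root penalized more than deeper ones.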