XML schema clustering with semantic and hierarchical similarity measures

Authors:
Richi Nayak;Wina Iryadi
Affiliations:
School of Information Systems, Queensland University of Technology, Brisbane, Qld., 4001, Australia;School of Information Systems, Queensland University of Technology, Brisbane, Qld., 4001, Australia
Venue:
Knowledge-Based Systems
Year:
2007

Abstract

With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual similarity of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
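
The idea of combining element-name similarity with hierarchical position can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formulas, so everything below is an assumption made for illustration. Each schema is modelled as a non-empty list of (element name, depth level) pairs, linguistic similarity is approximated by token overlap between element names, level differences are penalised by a 1/(1 + difference) weight, and schemas are grouped by a simple similarity threshold. The names tokens, element_sim, schema_sim, cluster and the threshold value 0.5 are all hypothetical.

```python
import re
from itertools import combinations


def tokens(name):
    """Split an element name such as 'bookTitle' or 'book_title' into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {p.lower() for p in parts}


def element_sim(a, b):
    """Token overlap of two element names, discounted by their level difference.

    Assumption: Jaccard overlap stands in for a real linguistic measure, and
    1 / (1 + |level difference|) stands in for a hierarchical measure.
    """
    (name_a, level_a), (name_b, level_b) = a, b
    ta, tb = tokens(name_a), tokens(name_b)
    if not ta or not tb:
        return 0.0
    name_sim = len(ta & tb) / len(ta | tb)
    level_weight = 1.0 / (1 + abs(level_a - level_b))
    return name_sim * level_weight


def schema_sim(s1, s2):
    """Symmetric average of best-match element similarities (schemas must be non-empty)."""
    def directed(x, y):
        return sum(max(element_sim(e, f) for f in y) for e in x) / len(x)
    return 0.5 * (directed(s1, s2) + directed(s2, s1))


def cluster(schemas, threshold=0.5):
    """Group schemas whose pairwise similarity reaches the (hypothetical) threshold."""
    names = list(schemas)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if schema_sim(schemas[a], schemas[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy schemas, invented for this example only.
schemas = {
    "book_a": [("book", 0), ("title", 1), ("author", 1), ("publisher", 1)],
    "book_b": [("book", 0), ("bookTitle", 1), ("authorName", 1)],
    "flight": [("flight", 0), ("departure", 1), ("arrival", 1), ("carrier", 1)],
}

if __name__ == "__main__":
    print(cluster(schemas))
```

On these toy inputs the two book schemas fall into one group and the flight schema into another. A dictionary-based or thesaurus-based name measure and a proper hierarchical clustering algorithm could replace the token overlap and the threshold grouping without changing the overall shape of the sketch.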