With the widespread diffusion of semi-structured data in XML format, algorithms for mining information from XML documents are becoming increasingly important, and a similarity function is key to a successful XML data management process. In this paper, we propose a new method for measuring the similarity between XML documents that considers both their structure and their content. The method comprises three layers of matching: element matching, path matching, and document matching. Structural similarity is computed with a partial matching technique, and content similarity is computed by taking into account the syntactic information, semantic information, and position of elements.
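The three layers named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm, whose formulas the abstract does not give; it is one plausible arrangement under stated assumptions. The names `element_sim`, `path_sim`, `leaf_paths`, and `document_sim`, the `SYNONYMS` table, the 0.9 synonym score, and the 0.6 threshold are all hypothetical. String similarity from `difflib` stands in for syntactic information, a tiny hand-written synonym table stands in for semantic information, and position is reflected only through the order-preserving alignment of path elements.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Hypothetical synonym pairs standing in for a real semantic resource.
SYNONYMS = {frozenset({"author", "writer"})}


def element_sim(a, b):
    """Layer 1: similarity of two element names, in [0, 1]."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    if frozenset({a, b}) in SYNONYMS:
        return 0.9  # assumed score for a semantic (synonym) match
    # Fall back to syntactic (string) similarity.
    return SequenceMatcher(None, a, b).ratio()


def path_sim(p, q, threshold=0.6):
    """Layer 2: order-preserving partial match between two paths."""
    # best[i][j] holds the best total score aligning p[:i] with q[:j].
    best = [[0.0] * (len(q) + 1) for _ in range(len(p) + 1)]
    for i in range(1, len(p) + 1):
        for j in range(1, len(q) + 1):
            s = element_sim(p[i - 1], q[j - 1])
            matched = best[i - 1][j - 1] + (s if s >= threshold else 0.0)
            best[i][j] = max(matched, best[i - 1][j], best[i][j - 1])
    # Normalize by the longer path so unmatched elements lower the score.
    return best[len(p)][len(q)] / max(len(p), len(q))


def leaf_paths(xml_text):
    """Return every root-to-leaf path of tag names in the document."""
    paths = []

    def walk(node, prefix):
        current = prefix + [node.tag]
        children = list(node)
        if not children:
            paths.append(tuple(current))
        for child in children:
            walk(child, current)

    walk(ET.fromstring(xml_text), [])
    return paths


def document_sim(xml_a, xml_b):
    """Layer 3: average best path match, taken in both directions."""
    pa, pb = leaf_paths(xml_a), leaf_paths(xml_b)

    def directed(src, dst):
        return sum(max(path_sim(p, q) for q in dst) for p in src) / len(src)

    return (directed(pa, pb) + directed(pb, pa)) / 2
```

Calling `document_sim` on two XML strings returns a score between 0 and 1. Normalizing each path score by the longer path is one simple way to reward partial matches without letting a short path match a long one perfectly; a fuller treatment would also compare text content and weight elements by their depth.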