Semantic Structural Similarity Measure for Clustering XML Documents

  • Authors:
  • Ling Song; Jun Ma; Jingsheng Lei; Dongmei Zhang; Zhen Wang

  • Affiliations:
  • School of Computer Science & Technology, Shandong University, China 250101 and School of Computer Science & Technology, Shandong Jianzhu University, China 250101; School of Computer Science & Technology, Shandong University, China 250101; College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, China 210003; School of Computer Science & Technology, Shandong University, China 250101 and School of Computer Science & Technology, Shandong Jianzhu University, China 250101; School of Computer Science & Technology, Shandong Jianzhu University, China 250101

  • Venue:
  • WISM '09 Proceedings of the International Conference on Web Information Systems and Mining
  • Year:
  • 2009

Abstract

Clustering XML documents semantically has become a major challenge in XML data management. The key research issue is to find suitable similarity functions for XML documents. However, previous work gave more weight to topological structure than to semantic information. In this paper, the similarity between two XML documents is computed from both structural and semantic information. A minimal spanning tree clustering method is then used to cluster the XML documents. The experimental results show that the new method performs better than the baseline similarity measure in terms of purity and Rand index.
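
The abstract does not give the similarity formula or the details of the clustering step, so the following is only a minimal sketch of the general pipeline it names, not the authors' implementation. It assumes a precomputed pairwise distance matrix (for example, 1 minus some similarity score), clusters by cutting the heaviest edges of a minimum spanning tree, and computes purity and the Rand index against reference labels. The function names `mst_clusters`, `purity`, and `rand_index`, and the toy matrix, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mst_clusters(dist, k):
    """Split n items into k clusters using a minimum spanning tree.

    dist is a symmetric n x n matrix of pairwise distances. How the
    distances are obtained is an assumption left to the caller.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm adds MST edges in increasing order of weight.
    # Stopping once k components remain is equivalent to building the
    # full MST and then removing its k - 1 heaviest edges.
    edges = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    components = n
    for _, i, j in edges:
        if components == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


def purity(pred, truth):
    """Fraction of items that belong to the majority class of their cluster."""
    total = 0
    for c in set(pred):
        members = [t for p, t in zip(pred, truth) if p == c]
        total += Counter(members).most_common(1)[0][1]
    return total / len(pred)


def rand_index(pred, truth):
    """Fraction of item pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(pred)), 2))
    agree = sum((pred[i] == pred[j]) == (truth[i] == truth[j]) for i, j in pairs)
    return agree / len(pairs)


# Toy distances for four documents: 0 and 1 are close, 2 and 3 are close.
toy_dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
toy_labels = mst_clusters(toy_dist, 2)
```

With the toy matrix, the two lightest edges join documents 0 and 1 and documents 2 and 3, so asking for two clusters separates those pairs. Any real use would replace `toy_dist` with distances derived from a combined structural and semantic similarity measure.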