Multilevel Conditional Fuzzy C-Means Clustering of XML Documents

  • Authors:
  • Michal Kozielski

  • Affiliations:
  • Silesian University of Technology, Akademicka 16, 44-100 Gliwice,

  • Venue:
  • PKDD 2007: Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2007

Abstract

XML documents are a special kind of data with a hierarchical structure. Typical clustering algorithms do not meet the requirements that may be stated for the analysis of such data. This paper presents a novel clustering method dedicated to XML documents, called Multilevel clustering of XML documents (ML). The method clusters feature vectors that encode XML documents at different structure levels. The paper proposes applying the Conditional Fuzzy C-Means algorithm to the ML method, and the advantage of this fuzzy method over the hard approach to the ML algorithm is discussed and demonstrated. An application of the ML method to accelerating query execution on XML documents is also discussed. Experimental results on two data sets with different characteristics show that the proposed method of multilevel conditional fuzzy clustering of XML documents outperforms hard multilevel clustering.
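
To make the term concrete, the following is a minimal sketch of a generic conditional fuzzy c-means iteration, in which the memberships of each data point sum to a per-point condition value instead of to 1. It is not the paper's implementation: the function names, the parameters (`m`, `n_iter`, `seed`) and the use of Euclidean distance are assumptions made for illustration, and the abstract does not say how the ML method derives the condition values or the feature vectors, so both are simply supplied by the caller here.

```python
import random


def _memberships(points, conditions, centers, exponent):
    """Membership update: row k sums to conditions[k] (hypothetical helper)."""
    rows = []
    for x, f in zip(points, conditions):
        # Euclidean distance from the point to every cluster center (assumed metric).
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        zeros = [d == 0.0 for d in dists]
        if any(zeros):
            # The point coincides with one or more centers: share f among them.
            rows.append([f / sum(zeros) if z else 0.0 for z in zeros])
        else:
            rows.append(
                [f / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
            )
    return rows


def conditional_fcm(points, conditions, n_clusters, m=2.0, n_iter=50, seed=0):
    """Illustrative conditional fuzzy c-means.

    points     -- list of feature vectors (lists of floats)
    conditions -- one value in [0, 1] per point; that point's memberships sum to it
    m          -- fuzzifier, must be greater than 1
    """
    rng = random.Random(seed)
    # Initialise centers from randomly chosen data points (an assumption).
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    exponent = 2.0 / (m - 1.0)
    dim = len(points[0])
    for _ in range(n_iter):
        u = _memberships(points, conditions, centers, exponent)
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            if total > 0.0:
                # Center = mean of the points weighted by membership ** m.
                centers[i] = [
                    sum(w * x[d] for w, x in zip(weights, points)) / total
                    for d in range(dim)
                ]
    # Recompute memberships so they match the final centers.
    return centers, _memberships(points, conditions, centers, exponent)


if __name__ == "__main__":
    pts = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    cond = [1.0, 0.5, 1.0, 0.0]  # hypothetical condition values
    centers, u = conditional_fcm(pts, cond, n_clusters=2)
    for row, f in zip(u, cond):
        print([round(v, 3) for v in row], "sums to", f)
```

With every condition value set to 1 the update reduces to standard fuzzy c-means. A point whose condition value is 0 receives zero membership in every cluster and therefore has no influence on the centers.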