Clustering can facilitate information retrieval. This paper addresses the problem of clustering a large number of XML documents. We propose the ICX algorithm, which uses a novel similarity metric based on quantitative paths. In our approach, each document is first represented by path sequences extracted from its XML tree. These sequences are then mapped into quantitative paths, from which the distance between documents can be computed with low complexity. Finally, the desired clusters are constructed using the ICX method with literal local search. Experimental results on XML documents obtained from DBLP show the effectiveness and good performance of the proposed techniques.
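To make the first steps of the pipeline concrete, the following is a minimal sketch of representing documents by their tag paths and comparing them. The abstract does not define the quantitative path mapping or the ICX distance, so this sketch substitutes a simple stand-in: it counts every root-to-node tag path and uses a weighted Jaccard distance over those counts. The function names `path_counts` and `path_distance` and the sample documents are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_counts(xml_text):
    """Count each root-to-node tag path in one XML document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        counts[path] += 1
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return counts


def path_distance(a, b):
    """Stand-in distance (an assumption, not the paper's metric):
    1 minus the weighted Jaccard overlap of two path-count vectors."""
    keys = set(a) | set(b)
    inter = sum(min(a[k], b[k]) for k in keys)
    union = sum(max(a[k], b[k]) for k in keys)
    return 1.0 - inter / union if union else 0.0


# Hypothetical DBLP-like documents, structure only.
doc1 = "<dblp><article><author/><title/></article></dblp>"
doc2 = "<dblp><article><author/><author/><title/></article></dblp>"
doc3 = "<dblp><book><editor/><title/></book></dblp>"

c1, c2, c3 = path_counts(doc1), path_counts(doc2), path_counts(doc3)
print(path_distance(c1, c2))  # two article records: small distance
print(path_distance(c1, c3))  # article vs. book: larger distance
```

Because each document is reduced to a compact vector of path counts, pairwise distances are cheap to compute, which is the property the abstract attributes to the quantitative path representation. Any clustering procedure that works from pairwise distances could consume these values; the ICX method itself is not reproduced here.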