XCLS is a novel clustering algorithm that groups heterogeneous XML documents by measuring their level similarity with a global criterion function. XCLS does not require pairwise similarity to be computed between individual documents; instead, it measures similarity at the cluster level, using the structural information of the XML documents. The quality of the clustering solution depends on how the level similarity is calculated and on whether it correctly represents the documents' structural similarity. In this paper, we present the performance of XCLS for clustering the structural descriptions (ordered labeled trees) of XML documents. We report results for five sub-tasks, corresponding to the five corpora provided by the INEX 2005 document mining track.
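
The idea of comparing each document with a cluster-level summary, rather than with every other document, can be illustrated with a minimal Python sketch. This is not the published XCLS level-similarity formula. The names `level_structure`, `level_overlap`, and `cluster_documents`, the depth weighting `base`, and the `threshold` value are all assumptions made for illustration. The sketch also ignores sibling order, which the ordered labeled trees in the abstract would preserve.

```python
import xml.etree.ElementTree as ET


def level_structure(xml_text):
    """Return a list of sets: the element tag names found at each tree depth."""
    levels = []
    frontier = [ET.fromstring(xml_text)]
    while frontier:
        levels.append({node.tag for node in frontier})
        frontier = [child for node in frontier for child in node]
    return levels


def level_overlap(doc_levels, cluster_levels, base=2.0):
    """Toy similarity in [0, 1]: weighted share of tags matched at the same depth.

    Shallower levels receive larger weights. This is an illustrative stand-in
    (an assumption), not the level similarity defined for XCLS.
    """
    depth = max(len(doc_levels), len(cluster_levels))
    matched = total = 0.0
    for i in range(depth):
        a = doc_levels[i] if i < len(doc_levels) else set()
        b = cluster_levels[i] if i < len(cluster_levels) else set()
        weight = base ** (depth - i - 1)
        matched += weight * len(a & b)
        total += weight * len(a | b)
    return matched / total if total else 0.0


def cluster_documents(docs, threshold=0.5):
    """Single pass: each document is compared with cluster summaries only,
    never with other individual documents."""
    clusters = []  # each cluster: {"levels": [set, ...], "members": [int, ...]}
    for index, xml_text in enumerate(docs):
        levels = level_structure(xml_text)
        best, best_score = None, 0.0
        for cluster in clusters:
            score = level_overlap(levels, cluster["levels"])
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            # Fold the document's tags into the cluster summary, level by level.
            merged = best["levels"]
            for i, tags in enumerate(levels):
                if i < len(merged):
                    merged[i] |= tags
                else:
                    merged.append(set(tags))
            best["members"].append(index)
        else:
            clusters.append({"levels": levels, "members": [index]})
    return clusters


docs = [
    "<book><title/><author/></book>",
    "<book><title/><author/><year/></book>",
    "<order><item><sku/></item></order>",
]
print([c["members"] for c in cluster_documents(docs)])  # prints [[0, 1], [2]]
```

In this sketch the two `book` documents share their root and most second-level tags, so they fall into one cluster, while the `order` document shares no tags at any level and starts a new cluster. The number of comparisons grows with the number of clusters rather than with the number of document pairs, which is the property the abstract attributes to measuring similarity at the cluster level.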