This paper presents an experimental study conducted on the INEX 2007 Document Mining Challenge corpus using a frequent subtree-based incremental clustering approach. Closed frequent subtrees are first generated from the structural information of the XML documents. A matrix representing the distribution of these closed frequent subtrees across the documents is then built and used to cluster the XML documents progressively. Despite the large number of documents in the INEX 2007 Wikipedia dataset, the proposed frequent subtree-based incremental clustering approach was successful in clustering the documents.
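To make the pipeline described above more concrete, the following is a minimal Python sketch of the last two steps: building a document-by-subtree matrix and clustering documents one at a time. The abstract does not specify the matrix encoding, the similarity measure, or the assignment rule, so everything here is an assumption chosen for illustration: a binary occurrence matrix, cosine similarity against running cluster centroids, and a hypothetical `threshold` parameter. The subtree identifiers (`t1` ... `t5`) and the function names are also hypothetical, and the mining of closed frequent subtrees is assumed to have been done already.

```python
from math import sqrt


def build_matrix(doc_subtrees, subtree_ids):
    """Binary matrix: entry [d][s] is 1.0 if subtree s occurs in document d.

    Assumption: a 0/1 encoding; the paper may use counts or weights instead.
    """
    return [[1.0 if s in subtrees else 0.0 for s in subtree_ids]
            for subtrees in doc_subtrees]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def incremental_cluster(rows, threshold=0.5):
    """Assign each row to the most similar existing cluster, or open a new one.

    Assumption: a simple single-pass scheme; `threshold` is a hypothetical knob.
    """
    centroids, sizes, labels = [], [], []
    for row in rows:
        best, best_sim = -1, threshold
        for k, centroid in enumerate(centroids):
            sim = cosine(row, centroid)
            if sim >= best_sim:
                best, best_sim = k, sim
        if best < 0:
            # No cluster is similar enough: start a new cluster with this document.
            centroids.append(list(row))
            sizes.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the chosen cluster.
            n = sizes[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], row)]
            sizes[best] = n + 1
            labels.append(best)
    return labels


# Toy input: each document is the set of closed frequent subtrees it contains.
subtree_ids = ["t1", "t2", "t3", "t4", "t5"]
docs = [{"t1", "t2"}, {"t1", "t2", "t3"}, {"t4"}, {"t4", "t5"}]

matrix = build_matrix(docs, subtree_ids)
labels = incremental_cluster(matrix, threshold=0.5)
print(labels)
```

In this toy input the first two documents share subtrees `t1` and `t2`, and the last two share `t4`, so the single pass groups them into two clusters. A single-pass scheme like this is order-dependent, which is a common trade-off of incremental clustering when the corpus is too large for pairwise comparison of all documents.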