XML documents are inherently semi-structured, comprising both structural and content features. Most existing methods for clustering XML documents consider only one of these two aspects. In this paper, we propose a fuzzy projected clustering algorithm for XML documents that clusters them efficiently by combining structural and content features. A further contribution is the use of fuzzy techniques such that each frequent induced substructure has a fuzzy parameter associated with each cluster. Experimental results on both synthetic and real datasets show the effectiveness of the algorithm, especially when it is applied to large schemaless XML document collections.
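
The abstract does not spell out how the fuzzy parameters are computed, so the following Python sketch is only an illustration of the general idea, not the paper's algorithm. It assumes that each document is reduced to a set of structural paths (`paths`) and a list of content terms (`terms`), that the fuzzy parameter of a substructure for a cluster is the fraction of that cluster's documents containing it, and that structure and content scores are mixed by a hypothetical weight `alpha`. All function names and these definitions are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def substructure_weights(clusters, substructures):
    """Assumed fuzzy parameter: share of a cluster's documents containing each substructure."""
    weights = []
    for docs in clusters:
        n = len(docs)
        weights.append({
            s: (sum(1 for d in docs if s in d["paths"]) / n if n else 0.0)
            for s in substructures
        })
    return weights


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def structural_score(doc, cluster_weights):
    """Weighted share of the cluster's substructures that the document contains."""
    total = sum(cluster_weights.values())
    if total == 0:
        return 0.0
    matched = sum(w for s, w in cluster_weights.items() if s in doc["paths"])
    return matched / total


def memberships(doc, clusters, weights, alpha=0.5):
    """Fuzzy membership of doc in each cluster; the values sum to 1."""
    doc_terms = Counter(doc["terms"])
    scores = []
    for docs, w in zip(clusters, weights):
        centroid = Counter()
        for d in docs:
            centroid.update(d["terms"])
        content = cosine(doc_terms, centroid)
        # alpha balances structure against content (hypothetical parameter).
        scores.append(alpha * structural_score(doc, w) + (1 - alpha) * content)
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]
```

Under these assumptions, a document is not assigned to a single cluster outright. It receives a degree of membership in every cluster, and substructures that are rare within a cluster contribute little to that cluster's structural score.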