FXProj: a fuzzy XML documents projected clustering based on structure and content

Authors:
Tengfei Ji;Xiaoyuan Bao;Dongqing Yang
Affiliations:
Peking University, Beijing, China;Peking University, Beijing, China;Peking University, Beijing, China
Venue:
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I
Year:
2011

Abstract

XML documents are inherently semi-structured, comprising both structural and content features. Most existing methods for clustering XML documents consider only one of these two aspects. In this paper, we propose a fuzzy projected clustering algorithm for XML documents that clusters them efficiently by combining structural and content features. Another contribution is the adoption of fuzzy techniques, such that each frequent induced substructure has a fuzzy parameter associated with each cluster. Experimental results on both synthetic and real datasets show its effectiveness, especially when applied to large schemaless XML document collections.
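
The idea of scoring a document against clusters by both structure and content, with a per-cluster fuzzy weight on each substructure, can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose formulas are not given in this record. Everything here is an assumption made for illustration: root-to-node tag paths stand in for frequent induced substructures, content similarity is plain cosine similarity over word counts, the two scores are mixed by a hypothetical parameter `alpha`, and the cluster definitions are invented sample data.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Set of root-to-node tag paths (assumed stand-in for substructures)."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def term_counts(xml_text):
    """Lowercased word counts over all text content of the document."""
    root = ET.fromstring(xml_text)
    return Counter(" ".join(root.itertext()).lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def structural_score(paths, fuzzy_weights):
    """Share of a cluster's total fuzzy weight matched by the document."""
    total = sum(fuzzy_weights.values())
    matched = sum(w for p, w in fuzzy_weights.items() if p in paths)
    return matched / total if total else 0.0


def memberships(xml_text, clusters, alpha=0.5):
    """Fuzzy membership of one document in every cluster (sums to 1).

    alpha is a hypothetical mixing weight: 1.0 = structure only,
    0.0 = content only.
    """
    paths = tag_paths(xml_text)
    terms = term_counts(xml_text)
    scores = {
        name: alpha * structural_score(paths, c["weights"])
        + (1 - alpha) * cosine(terms, c["terms"])
        for name, c in clusters.items()
    }
    total = sum(scores.values())
    if total == 0:
        # No evidence for any cluster: spread membership uniformly.
        return {name: 1 / len(scores) for name in scores}
    return {name: s / total for name, s in scores.items()}


# Invented sample data: each cluster maps substructures to fuzzy weights
# in [0, 1] and carries a term-count profile.
clusters = {
    "books": {
        "weights": {"/book": 1.0, "/book/title": 0.8},
        "terms": Counter({"fuzzy": 1, "clustering": 1}),
    },
    "music": {
        "weights": {"/album": 1.0, "/album/track": 0.6},
        "terms": Counter({"jazz": 1}),
    },
}
doc = "<book><title>fuzzy clustering</title><author>Ji</author></book>"
result = memberships(doc, clusters)
print(result)
```

Because each document receives a graded membership in every cluster rather than a hard assignment, a document that shares structure with one cluster and vocabulary with another is split between them instead of being forced into either.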