Clustering XML documents by structure

Authors:
Anna Lesniewska
Affiliations:
Institute of Computing Science, Poznan University of Technology, Poznan, Poland
Venue:
ADBIS'09 Proceedings of the 13th East European conference on Advances in Databases and Information Systems
Year:
2009


Abstract

Clustering XML documents is an important data mining task whose aim is to group similar XML documents. This paper considers the clustering of XML documents by structure and proposes two independent methods. The first method represents a set of XML documents as a set of labels. The second method introduces a new representation of a set of XML documents, called the SuperTree. The paper suggests that the proposed methods may improve the accuracy of structure-based XML clustering. This claim rests on tests, conducted on heterogeneous and homogeneous data sets, that assess the advantages of the proposals.
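
The abstract does not say how the label representation is built or compared, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that a document's labels are its element tag names and that two documents are compared with Jaccard similarity; both choices, along with the names `label_set` and `jaccard` and the sample documents, are hypothetical and used purely for illustration.

```python
import xml.etree.ElementTree as ET


def label_set(xml_text):
    """Return the set of element tag names in one XML document.

    Assumption: "labels" means tag names; the abstract does not define them.
    """
    root = ET.fromstring(xml_text)
    return {elem.tag for elem in root.iter()}


def jaccard(a, b):
    """Jaccard similarity of two label sets (1.0 when both are empty).

    Assumption: an illustrative similarity measure, not taken from the paper.
    """
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical sample documents with different structures.
docs = {
    "book1": "<book><title/><author/><year/></book>",
    "book2": "<book><title/><author/></book>",
    "cd1": "<cd><title/><artist/></cd>",
}
labels = {name: label_set(text) for name, text in docs.items()}

for first, second in [("book1", "book2"), ("book1", "cd1")]:
    print(first, second, jaccard(labels[first], labels[second]))
```

In this sketch the two book documents share most of their tag names and so score higher than the book and CD pair. A pairwise similarity of this kind could then be passed to any standard clustering algorithm.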