Clustering XML documents using self-organizing maps for structures

  • Authors:
  • M. Hagenbuchner; A. Sperduti; A. C. Tsoi; F. Trentini; F. Scarselli; M. Gori

  • Affiliations:
  • University of Wollongong, Wollongong, Australia; University of Padova, Padova, Italy; Monash University, Melbourne, Australia; University of Siena, Siena, Italy; University of Siena, Siena, Italy; University of Siena, Siena, Italy

  • Venue:
  • INEX'05 Proceedings of the 4th international conference on Initiative for the Evaluation of XML Retrieval
  • Year:
  • 2005

Abstract

Self-Organizing Maps capable of encoding structured information will be used for the clustering of XML documents. Documents formatted in XML are appropriately represented as graph data structures. It will be shown that Self-Organizing Maps can be trained in an unsupervised fashion to group XML structured data into clusters, and that the time required for this task scales linearly with the size of the corpus. It will also be shown that some simple prior knowledge of the data structures is beneficial to the efficient grouping of the XML documents.
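
As an illustration of the claim that XML documents map naturally onto graph data structures, the following minimal sketch turns an XML string into a labelled tree of element tags. It is not the authors' preprocessing pipeline: the function names `xml_to_tree` and `count_nodes`, the sample document, and the decision to keep only tag names (dropping text and attributes) are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def xml_to_tree(xml_text):
    """Parse XML and return a nested (tag, children) tuple.

    Hypothetical helper: keeps only element tags, ignoring text content
    and attributes, so that only the document structure remains.
    """
    def build(elem):
        return (elem.tag, [build(child) for child in elem])

    return build(ET.fromstring(xml_text))


def count_nodes(tree):
    """Count the nodes of a (tag, children) tree."""
    _tag, children = tree
    return 1 + sum(count_nodes(child) for child in children)


# Illustrative document, not taken from the INEX corpus.
doc = (
    "<article><title>t</title>"
    "<body><sec><p>a</p><p>b</p></sec></body></article>"
)
tree = xml_to_tree(doc)
print(tree[0], count_nodes(tree))
```

A structure-aware model such as the one described in the abstract would consume trees of this kind, one node at a time, rather than a flat feature vector; how the nodes are labelled and fed to the map is left open here.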