Producing accurate interpretable clusters from high-dimensional data

  • Authors: Derek Greene; Pádraig Cunningham
  • Affiliation: University of Dublin, Trinity College, Dublin 2, Ireland (both authors)
  • Venue: PKDD'05, Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year: 2005

Abstract

The primary goal of cluster analysis is to produce clusters that accurately reflect the natural groupings in the data. A second objective is to identify features that are descriptive of the clusters. In addition to these requirements, we often wish to allow objects to be associated with more than one cluster. In this paper we present a technique, based on the spectral co-clustering model, that is effective in meeting these objectives. Our evaluation on a range of text clustering problems shows that the proposed method yields accuracy superior to that afforded by existing techniques, while producing cluster descriptions that are amenable to human interpretation.
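
For readers unfamiliar with the spectral co-clustering model that the abstract builds on, the following is a minimal sketch of the basic two-way bipartite spectral partitioning idea (Dhillon, 2001). It is not the technique proposed in this paper: it produces hard, non-overlapping clusters, and the function name, the toy document-term matrix, and the use of power iteration in place of a full SVD are all assumptions made for illustration.

```python
import math


def spectral_cocluster_bipartition(A, iters=200):
    """Split the rows and columns of a nonnegative matrix into two co-clusters.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    Assumes every row and every column of A has a positive sum.
    """
    m, n = len(A), len(A[0])
    row_sum = [sum(row) for row in A]
    col_sum = [sum(A[i][j] for i in range(m)) for j in range(n)]
    total = sum(row_sum)

    # Normalised matrix An = D1^{-1/2} A D2^{-1/2}.
    An = [[A[i][j] / math.sqrt(row_sum[i] * col_sum[j]) for j in range(n)]
          for i in range(m)]

    # The leading singular pair of An is known in closed form (singular
    # value 1), so it is removed by deflation instead of being computed.
    u1 = [math.sqrt(r / total) for r in row_sum]
    v1 = [math.sqrt(c / total) for c in col_sum]
    B = [[An[i][j] - u1[i] * v1[j] for j in range(n)] for i in range(m)]

    # Power iteration on B^T B approximates the second right singular vector.
    v = [(-1.0) ** j * (j + 1) for j in range(n)]  # fixed, arbitrary start
    for _ in range(iters):
        u = [sum(B[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [sum(B[i][j] * u[i] for i in range(m)) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(B[i][j] * v[j] for j in range(n)) for i in range(m)]

    # Rescale by D^{-1/2} and split rows and columns by sign.
    row_labels = [int(u[i] / math.sqrt(row_sum[i]) > 0) for i in range(m)]
    col_labels = [int(v[j] / math.sqrt(col_sum[j]) > 0) for j in range(n)]
    return row_labels, col_labels


# Toy document-term counts (made up): rows are documents, columns are terms.
doc_term = [
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [0, 0, 2, 3],
    [0, 1, 3, 2],
]
doc_labels, term_labels = spectral_cocluster_bipartition(doc_term)
print(doc_labels, term_labels)
```

Because documents and terms are partitioned together, the terms that share a label with a group of documents can be read as a rough description of that group. This is the kind of interpretability the abstract refers to, although the paper's own method additionally allows objects to belong to more than one cluster.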