A new approach for cluster detection for large datasets with high dimensionality

  • Authors:
  • Matthew Gebski;Raymond K. Wong

  • Affiliations:
  • National ICT Australia and School of Computer Science & Engineering, University of New South Wales, Sydney, NSW, Australia;National ICT Australia and School of Computer Science & Engineering, University of New South Wales, Sydney, NSW, Australia

  • Venue:
  • DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2005

Abstract

The study of how computers are used through human-computer interfaces (HCI) is essential for improving productivity in any computing environment. HCI analysts use a number of techniques to build models that are faithful to actual computer use. A key technique is eye tracking, in which the region of the screen being examined is recorded in order to identify key areas of use. Clustering techniques allow these regions to be grouped, which facilitates usability analysis. Historically, approaches such as Expectation Maximization (EM) and K-Means have performed well. Unfortunately, these approaches require the number of clusters k to be known beforehand; in many real-world situations, this hampers the analysis of the data. We propose a novel algorithm that is well suited to cluster discovery in HCI data: it does not require the number of clusters to be specified a priori, and it scales very well to both large datasets and high dimensionality. Experiments demonstrate that our approach works well on real data from HCI applications.
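To illustrate the limitation of K-Means noted in the abstract, the following minimal sketch runs plain K-Means (Lloyd's algorithm) on synthetic gaze points. It is not the algorithm proposed in the paper, which the abstract does not describe. The screen coordinates, the three simulated fixation regions, and the names `kmeans` and `inertia` are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means on 2-D points. Note that k must be supplied."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mx = sum(x for x, _ in members) / len(members)
                my = sum(y for _, y in members) / len(members)
                new_centroids.append((mx, my))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min((x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids)
        for x, y in points
    )


# Hypothetical gaze data: fixations scattered around three screen regions (pixels).
rng = random.Random(42)
region_centres = [(200, 150), (800, 200), (500, 600)]
points = [
    (rng.gauss(cx, 25), rng.gauss(cy, 25))
    for cx, cy in region_centres
    for _ in range(60)
]

# The analyst has to guess k; each guess yields a different grouping.
results = {k: kmeans(points, k) for k in (2, 3, 5)}
for k, (centroids, clusters) in results.items():
    sizes = sorted(len(c) for c in clusters)
    print(f"k={k}: cluster sizes={sizes}, inertia={inertia(points, centroids):.0f}")
```

The algorithm always returns exactly k groups, whether or not k matches the number of regions the user actually looked at. This is why a method that does not need k in advance is attractive for exploratory eye-tracking analysis.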