An Empirical Study of Categorical Dataset Visualization Using a Simulated Bee Colony Clustering Algorithm

  • Authors:
  • James D. McCaffrey

  • Affiliations:
  • Microsoft MSDN / Volt VTE, One Microsoft Way, Redmond, USA 98052

  • Venue:
  • ISVC '09 Proceedings of the 5th International Symposium on Advances in Visual Computing: Part I
  • Year:
  • 2009

Visualization

Abstract

This study investigates the use of a biologically inspired meta-heuristic algorithm to cluster categorical datasets so that the data can be presented in a useful visual form. A computer program that implemented the algorithm was run against a benchmark dataset of voting records and produced better results, in terms of cluster accuracy, than those reported in all known published studies. Compared to alternative clustering and visualization approaches, the categorical dataset clustering with a simulated bee colony (CDC-SBC) algorithm has the advantage of allowing arbitrarily large datasets to be analyzed. Its primary disadvantages are that it requires a relatively large number of input parameters and that it does not guarantee convergence to an optimal solution. The results suggest that the CDC-SBC approach to categorical data visualization may be both practical and useful in certain scenarios.
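
The abstract does not give the details of CDC-SBC, so the following is only a minimal sketch of the general idea of a bee-colony-style search over cluster assignments for categorical data. It is not the paper's implementation. The objective (`mismatch_cost`, a count of attribute values that differ from their cluster's most common value), the function and parameter names (`sbc_cluster`, `n_bees`, `n_iters`, `p_mistake`), and the toy records are all assumptions made for illustration. The sketch does reflect the two disadvantages the abstract names: several input parameters must be chosen, and the search may stop at a non-optimal clustering.

```python
import random
from collections import Counter

def mismatch_cost(rows, labels, k):
    """Count attribute values that differ from their cluster's most common value.

    Assumed objective for this sketch; lower means more homogeneous clusters.
    """
    cost = 0
    for c in range(k):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        if not members:
            continue
        for col in zip(*members):  # iterate over attributes (columns)
            cost += len(col) - Counter(col).most_common(1)[0][1]
    return cost

def sbc_cluster(rows, k, n_bees=20, n_iters=300, p_mistake=0.05, seed=0):
    """Simplified bee-colony-style search (hypothetical, not the paper's code).

    Each bee holds one candidate assignment of rows to k clusters and
    repeatedly examines a neighboring assignment that moves a single row.
    """
    rng = random.Random(seed)
    n = len(rows)
    bees = [[rng.randrange(k) for _ in range(n)] for _ in range(n_bees)]
    costs = [mismatch_cost(rows, b, k) for b in bees]
    best_i = min(range(n_bees), key=costs.__getitem__)
    best, best_cost = bees[best_i][:], costs[best_i]

    for _ in range(n_iters):
        for i in range(n_bees):
            neighbor = bees[i][:]
            neighbor[rng.randrange(n)] = rng.randrange(k)
            c = mismatch_cost(rows, neighbor, k)
            # Accept a better neighbor; occasionally accept a worse one
            # so a bee can leave a poor local optimum.
            if c < costs[i] or rng.random() < p_mistake:
                bees[i], costs[i] = neighbor, c
            if costs[i] < best_cost:
                best, best_cost = bees[i][:], costs[i]
    return best, best_cost

# Toy yes/no records (made-up values, not the benchmark voting dataset).
rows = [("y", "y", "n")] * 3 + [("n", "n", "y")] * 3
labels, cost = sbc_cluster(rows, k=2)
print(labels, cost)
```

Because the search is stochastic, a different `seed` or parameter setting can return a different clustering. The returned labels could then drive a visual grouping of the records, for example by ordering or coloring rows by cluster.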