An Empirical Study of Categorical Dataset Visualization Using a Simulated Bee Colony Clustering Algorithm

Authors:
James D. McCaffrey
Affiliations:
Microsoft MSDN / Volt VTE, One Microsoft Way, Redmond, USA 98052
Venue:
ISVC '09 Proceedings of the 5th International Symposium on Advances in Visual Computing: Part I
Year:
2009

Abstract

This study investigates the use of a biologically inspired meta-heuristic algorithm to cluster categorical datasets so that the data can be presented in a useful visual form. A computer program that implemented the algorithm was executed against a benchmark dataset of voting records and produced better results, in terms of cluster accuracy, than those reported in all known published studies. Compared to alternative clustering and visualization approaches, the categorical dataset clustering with a simulated bee colony (CDC-SBC) algorithm has the advantage of allowing arbitrarily large datasets to be analyzed. The primary disadvantages of the CDC-SBC algorithm for dataset clustering and visualization are that it requires a relatively large number of input parameters and that it does not guarantee convergence to an optimal solution. The results suggest that the CDC-SBC approach to categorical data visualization may be both practical and useful in certain scenarios.