Interactive exploration of fuzzy clusters using Neighborgrams

Authors:
Michael R. Berthold;Bernd Wiswedel;David E. Patterson
Affiliations:
Michael R. Berthold: Department of Computer and Information Science, University of Konstanz, Box M712, 78457 Konstanz, Germany and Data Analysis Research Lab, Tripos Inc., USA
Bernd Wiswedel: Department of Computer and Information Science, University of Konstanz, Box M712, 78457 Konstanz, Germany and Data Analysis Research Lab, Tripos Inc., USA
David E. Patterson: Data Analysis Research Lab, Tripos Inc., USA
Venue:
Fuzzy Sets and Systems
Year:
2005


Visualization

Abstract

We describe an interactive method for generating a set of fuzzy clusters for the classes of interest in a given labeled data set. Because it builds clusters only for the classes of interest, the method is best suited to applications where the analysis focuses on a model for the minority class, or to small and medium-sized data sets. The clustering algorithm creates one-dimensional models of the neighborhood for a set of patterns: it constructs a cluster candidate for each pattern of interest and then chooses the best subset of candidates to form a global model of the data. The accompanying visualization of these neighborhoods lets the user interact with the clustering process by selecting, discarding, or fine-tuning potential cluster candidates. Clusters can be crisp or fuzzy, and the fuzzy variant substantially improves classification accuracy. We demonstrate the performance of the underlying algorithm on several data sets from the StatLog project and show its usefulness for visual cluster exploration on the Iris data and on a large molecular data set from the National Cancer Institute.
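
The candidate-then-select idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: Euclidean distance, crisp candidates that stop at the first neighbor of another class, and a greedy selection that repeatedly takes the candidate covering the most still-uncovered patterns. The function names (`neighborgram`, `candidate`, `greedy_clusters`) and the parameters `size` and `min_cover` are hypothetical, and fuzzy membership functions are left out.

```python
import math


def neighborgram(center_idx, points, labels, size):
    """One-dimensional neighborhood model of a single pattern (assumed form):
    its `size` nearest neighbors as (distance, index, label), sorted by distance."""
    center = points[center_idx]
    ranked = sorted(
        (math.dist(center, p), i, labels[i]) for i, p in enumerate(points)
    )
    return ranked[:size]


def candidate(ngram, target):
    """Crisp cluster candidate (assumption): grow outward from the center and
    stop at the first neighbor whose class differs from `target`."""
    covered, radius = set(), 0.0
    for dist, idx, label in ngram:
        if label != target:
            break
        covered.add(idx)
        radius = dist
    return radius, covered


def greedy_clusters(points, labels, target, size=10, min_cover=2):
    """Greedy selection (assumption): repeatedly keep the candidate that covers
    the most not-yet-covered target patterns."""
    remaining = {i for i, lab in enumerate(labels) if lab == target}
    grams = {i: neighborgram(i, points, labels, size) for i in remaining}
    clusters = []
    while remaining:
        best = None
        for i in sorted(remaining):
            radius, covered = candidate(grams[i], target)
            covered &= remaining  # count only patterns not yet explained
            if best is None or len(covered) > len(best[2]):
                best = (i, radius, covered)
        if len(best[2]) < min_cover:
            break  # leftover candidates are too small to keep
        clusters.append(best)
        remaining -= best[2]
    return clusters


# Toy data: five patterns of the class of interest "a", two of class "b".
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (2.5, 2.5), (9, 9)]
labels = ["a", "a", "a", "a", "a", "b", "b"]
clusters = greedy_clusters(points, labels, target="a")
for center, radius, covered in clusters:
    print(center, radius, sorted(covered))
```

In an interactive setting, the greedy choice inside the loop is the step a user would override by selecting, discarding, or adjusting a candidate after inspecting its neighborhood.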