Investigating diversity of clustering methods: An empirical comparison

Authors:
Roy Gelbard; Orit Goldman; Israel Spiegler
Affiliations:
Information Systems Program, Graduate School of Business Administration, Bar-Ilan University, Ramat-Gan 52900, Israel;Technology and Information Systems Program, The Recanati Graduate School of Business Administration, Tel Aviv University, Tel Aviv 69978, Israel;Technology and Information Systems Program, The Recanati Graduate School of Business Administration, Tel Aviv University, Tel Aviv 69978, Israel
Venue:
Data & Knowledge Engineering
Year:
2007

Abstract

This paper aims to shed light on why clustering algorithms, despite being quantitative and hence supposedly objective, yield different and varied results. To that end, we took 10 common clustering algorithms and tested them on four known datasets that are used in the literature as baselines with agreed-upon clusters. One additional method, Binary-Positive, developed by our team, was added to the analysis. The results affirm the unpredictable nature of the clustering process and point to the different assumptions made by different methods. One conclusion of the study is that the clustering method should be chosen carefully for any given application.
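
One way to quantify how far a clustering result departs from an agreed-upon baseline, or how much two methods disagree with each other, is a pair-counting measure such as the Rand index. The sketch below is illustrative only: the abstract does not state which agreement measure the authors used, and the label lists (`baseline`, `method_x`, `method_y`) are made-up toy data, not results from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Return the fraction of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two items
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        if same_in_a == same_in_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy data: an agreed-upon baseline and two method outputs.
baseline = [0, 0, 0, 1, 1, 1]
method_x = [1, 1, 1, 0, 0, 0]  # same partition, only the label names differ
method_y = [0, 0, 1, 1, 2, 2]  # a genuinely different partition

score_x = rand_index(baseline, method_x)
score_y = rand_index(baseline, method_y)
print(score_x, score_y)
```

Because the measure compares pairs of items rather than label values, `method_x` scores as a perfect match even though its cluster names are swapped, while `method_y` scores lower because it splits and merges the baseline groups.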