This paper proposes a novel instance-similarity-based evaluation metric that reduces the search space for clustering. An aggregate global score is calculated for each instance using the Fibonacci series. The Fibonacci numbers separate the instances effectively; hence, intra-cluster similarity is increased and inter-cluster similarity is decreased during clustering. The proposed FIBCLUS algorithm handles datasets with numerical attributes, categorical attributes, or a mix of both. Results obtained with FIBCLUS are compared with those of existing algorithms such as k-means, x-means, expectation maximization, and hierarchical clustering, which are widely used for numeric, categorical, and mixed data types. Empirical analysis shows that FIBCLUS produces better clustering solutions than these existing algorithms in terms of entropy, purity, and F-score.
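The abstract does not give the exact FIBCLUS scoring formula, but the core idea it describes — collapsing each instance into one aggregate global score by weighting its attributes with Fibonacci numbers — can be sketched as follows. Everything here (the function names, the per-attribute encoding of categorical values, the weighted sum) is a hypothetical illustration of that idea, not the published algorithm:

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers: 1, 2, 3, 5, 8, ..."""
    a, b = 1, 2
    seq = []
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq


def global_score(instance, value_maps):
    """Aggregate one instance into a single score by weighting each
    attribute's encoded value with a distinct Fibonacci number.

    Because Fibonacci numbers grow quickly and no term equals the sum
    of smaller non-adjacent terms in the same way a fixed base would,
    distinct attribute combinations tend to map to well-separated
    scores, which is the separation effect the abstract describes.

    value_maps[i] is a dict encoding categorical values of attribute i
    to small integers, or None for a numeric attribute used directly
    (an assumed, simplified treatment of mixed types).
    """
    weights = fibonacci(len(instance))
    total = 0.0
    for w, value, mapping in zip(weights, instance, value_maps):
        v = mapping[value] if mapping is not None else float(value)
        total += w * v
    return total


# Hypothetical mixed-type instance: (color, size, numeric measurement).
maps = [{"red": 1, "blue": 2}, {"S": 1, "M": 2, "L": 3}, None]
score = global_score(("red", "M", 4.5), maps)
print(score)  # 1*1 + 2*2 + 3*4.5 = 18.5
```

Instances could then be clustered on these scalar scores (e.g. with k-means in one dimension), which is what "reducing the search space" would amount to under this reading.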