Heterogeneous clustering ensemble method for combining different cluster results

Authors:
Hye-Sung Yoon; Sun-Young Ahn; Sang-Ho Lee; Sung-Bum Cho; Ju Han Kim
Affiliations:
Department of Computer Science and Engineering, Ewha Womans University, Seoul, Korea (Yoon, Ahn, Lee); Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul, Korea (Cho, Kim)
Venue:
BioDM'06 Proceedings of the 2006 international conference on Data Mining for Biomedical Applications
Year:
2006

References (10):
Discovering Clusters in Gene Expression Data Using Evolutionary Approach. ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence
Cluster ensemble and its applications in gene expression analysis. APBC '04 Proceedings of the second conference on Asia-Pacific bioinformatics - Volume 29
Integration of Cluster Ensemble and Text Summarization for Gene Expression Analysis. BIBE '04 Proceedings of the 4th IEEE Symposium on Bioinformatics and Bioengineering
Ensemble Clustering in Medical Diagnostics. CBMS '04 Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems
Adaptive Clustering Ensembles. ICPR '04 Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04) - Volume 1
Model-based overlapping clustering. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Clustering of diverse genomic data using information fusion. Bioinformatics
Multiclass cancer classification and biomarker discovery using GA-based algorithms. Bioinformatics
Ensemble dependence model for classification and prediction of cancer and normal gene expression data. Bioinformatics
Integrating heterogeneous microarray data sources using correlation signatures. DILS'05 Proceedings of the Second international conference on Data Integration in the Life Sciences

Cited by (7):
A survey of evolutionary algorithms for clustering. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
A review: accuracy optimization in clustering ensembles using genetic algorithms. Artificial Intelligence Review
Weighted association based methods for the combination of heterogeneous partitions. Pattern Recognition Letters
A novel framework for discovering robust cluster results. DS'06 Proceedings of the 9th international conference on Discovery Science
Cluster ensembles. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
A hierarchical clusterer ensemble method based on boosting theory. Knowledge-Based Systems
A theoretic framework of K-means-based consensus clustering. IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Abstract

Biological data sets have been growing rapidly as bioinformatics technology has advanced. Data mining techniques have been used extensively to detect interesting patterns in large databases. In bioinformatics, clustering, a data mining technique, can be applied to find underlying genetic and biological interactions without relying on prior information about the data. However, many clustering algorithms are available in practice, and different algorithms may produce dissimilar results depending on the characteristics of the bio-data and the experimental assumptions. In this paper, we propose a novel heterogeneous clustering ensemble scheme that uses a genetic algorithm to generate high-quality, robust clustering results that reflect the characteristics of bio-data. The proposed method combines the results of various clustering algorithms with the crossover operation of a genetic algorithm, and is founded on the idea of using evolutionary processes to select the most commonly inherited characteristics. We show that the framework is applicable to a real data set, and we detail the optimal clustering results it generates. Experimental results demonstrate that the proposed method yields better clustering results than the single best clustering algorithm applied alone.
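
The general idea of combining partitions through crossover can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the encoding, the fitness measure, or the selection rule, so everything below is an assumption made for illustration. In particular, the fitness function (mean Rand index against the input partitions), the cluster-level crossover, the replace-the-worst selection, and the names `rand_index`, `fitness`, `crossover`, and `evolve` are all hypothetical. Partitions are assumed to be lists of integer cluster labels over the same items.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Fraction of item pairs that two partitions treat the same way
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def fitness(candidate, base_partitions):
    """Assumed fitness: mean Rand index against every base partition."""
    total = sum(rand_index(candidate, p) for p in base_partitions)
    return total / len(base_partitions)


def crossover(parent_a, parent_b, rng):
    """Assumed crossover: copy parent_b, then impose one randomly
    chosen cluster of parent_a on the copy under a fresh label."""
    chosen = rng.choice(sorted(set(parent_a)))
    fresh_label = max(parent_b) + 1  # a label unused in parent_b
    return [fresh_label if la == chosen else lb
            for la, lb in zip(parent_a, parent_b)]


def evolve(base_partitions, generations=200, seed=0):
    """Start from the base partitions and repeatedly replace the worst
    member of the population when a crossover child scores higher."""
    rng = random.Random(seed)
    population = [list(p) for p in base_partitions]
    for _ in range(generations):
        a, b = rng.sample(population, 2)
        child = crossover(a, b, rng)
        worst = min(population, key=lambda p: fitness(p, base_partitions))
        if fitness(child, base_partitions) > fitness(worst, base_partitions):
            population[population.index(worst)] = child
    return max(population, key=lambda p: fitness(p, base_partitions))


# Three hypothetical clusterings of the same six items,
# standing in for the outputs of different algorithms.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
consensus = evolve(base)
```

Because the population starts from the input partitions and only its worst member is ever replaced, the best fitness in the population cannot decrease, so under this assumed fitness the returned partition scores at least as well as the best single input.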