Clustering algorithms are employed in many bioinformatics tasks, including the categorization of protein sequences and the analysis of gene-expression data. Although these algorithms are routinely applied, many of them suffer from two limitations: (i) they rely on predetermined parameter settings, such as a priori knowledge of the number of clusters; and (ii) they involve nondeterministic procedures that yield inconsistent outcomes. A framework that addresses these shortcomings is therefore desirable. We provide a data-driven framework consisting of two interrelated steps: SVD-based dimension reduction, followed by automated tuning of the algorithm's parameter(s). The dimension reduction step is adapted to run efficiently on very large datasets. The optimal parameter setting is identified using an internal evaluation criterion, the Bayesian Information Criterion (BIC). The framework can incorporate most clustering algorithms and improve their performance. In this study we illustrate its effectiveness by incorporating the standard K-Means and Quantum Clustering algorithms. The implementations are applied to several gene-expression benchmarks with significant success.
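The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`svd_reduce`, `farthest_point_init`, `kmeans`, `bic`, `select_k`), the fixed number of retained dimensions, the deterministic farthest-point seeding, and the spherical-Gaussian form of the BIC are all assumptions made for illustration, and the data are synthetic rather than a gene-expression benchmark.

```python
import numpy as np

def svd_reduce(X, n_components):
    """Project mean-centered rows of X onto the leading right singular vectors."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def farthest_point_init(Z, k):
    """Deterministic seeding (an assumption): start with the point farthest
    from the mean, then repeatedly add the point farthest from all chosen seeds."""
    first = np.argmax(np.linalg.norm(Z - Z.mean(axis=0), axis=1))
    centers = [Z[first]]
    for _ in range(1, k):
        d2 = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d2)])
    return np.array(centers)

def kmeans(Z, k, n_iter=100):
    """Plain Lloyd iterations; deterministic because the seeding is."""
    centers = farthest_point_init(Z, k)
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def bic(Z, labels, centers):
    """BIC under an assumed model: hard assignments, one shared spherical
    variance, and cluster proportions as mixing weights. Lower is better."""
    n, d = Z.shape
    k = len(centers)
    sse = max(np.sum((Z - centers[labels]) ** 2), 1e-12)
    sigma2 = sse / (n * d)
    counts = np.bincount(labels, minlength=k)
    nz = counts[counts > 0]
    log_lik = (np.sum(nz * np.log(nz / n))
               - 0.5 * n * d * (np.log(2 * np.pi * sigma2) + 1))
    n_params = k * (d + 1)  # k*d center coordinates, k-1 weights, 1 variance
    return -2 * log_lik + n_params * np.log(n)

def select_k(Z, k_values):
    """Run K-Means for each candidate k and return the k with the lowest BIC."""
    scores = {k: bic(Z, *kmeans(Z, k)) for k in k_values}
    return min(scores, key=scores.get), scores

# Synthetic stand-in data: three well-separated groups in 50 dimensions.
rng = np.random.default_rng(0)
true_centers = rng.normal(scale=5.0, size=(3, 50))
X = np.vstack([c + rng.normal(size=(50, 50)) for c in true_centers])

Z = svd_reduce(X, n_components=2)
best_k, scores = select_k(Z, range(1, 7))
print("selected k:", best_k)
```

Because the seeding is deterministic, repeated runs on the same data return the same partition, which speaks to limitation (ii). Any other clustering algorithm could be substituted for `kmeans` as long as it returns labels and centers that the scoring function can evaluate.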