FPF-SB: a scalable algorithm for microarray gene expression data clustering

Authors and affiliations:
Filippo Geraci (CNR, Istituto di Informatica e Telematica, Pisa, Italy and Dipartimento di Ingegneria dell'Informazione, Università di Siena, Siena, Italy)
Mauro Leoncini (CNR, Istituto di Informatica e Telematica, Pisa, Italy and Dipartimento di Ingegneria dell'Informazione, Università di Modena e Reggio Emilia, Modena, Italy)
Manuela Montangero (CNR, Istituto di Informatica e Telematica, Pisa, Italy and Dipartimento di Ingegneria dell'Informazione, Università di Modena e Reggio Emilia, Modena, Italy)
Marco Pellegrini (CNR, Istituto di Informatica e Telematica, Pisa, Italy)
M. Elena Renda (CNR, Istituto di Informatica e Telematica, Pisa, Italy)
Venue:
ICDHM'07 Proceedings of the 1st international conference on Digital human modeling
Year:
2007

References (6)

Optimal algorithms for approximate clustering. STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing
Cluster Analysis for Gene Expression Data: A Survey. IEEE Transactions on Knowledge and Data Engineering
Fuzzy J-Means and VNS methods for clustering genes from microarray data. Bioinformatics
Clustering short time series gene expression data. Bioinformatics
A scalable algorithm for high-quality clustering of web snippets. Proceedings of the 2006 ACM symposium on Applied computing
Incorporating biological knowledge into distance-based clustering analysis of microarray gene expression data. Bioinformatics

Cited by (1)

Clustering biological data using voronoi diagram. ADCONS'11 Proceedings of the 2011 international conference on Advanced Computing, Networking and Security

Abstract

Efficient and effective analysis of large microarray gene expression datasets is one of the keys to time-critical personalized medicine. The issue we address here is the scalability of data processing software for clustering gene expression data into groups with homogeneous expression profiles. In this paper we propose FPF-SB, a novel clustering algorithm that combines the Furthest-Point-First (FPF) heuristic for solving the k-center problem with a stability-based method for determining the number of clusters k. Our algorithm improves the state of the art: it scales to large datasets without sacrificing output quality.
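
For readers unfamiliar with the FPF heuristic named in the abstract, the following is a minimal sketch of the textbook furthest-point-first procedure for the k-center problem. It is not the authors' FPF-SB implementation: the function name `fpf_centers`, the use of Euclidean distance, and the choice of the first center are assumptions made for illustration, and the stability-based selection of k is not shown because the abstract does not describe it.

```python
import math


def fpf_centers(points, k, first=0):
    """Textbook furthest-point-first heuristic for k-center (illustrative only).

    Assumptions: points are numeric tuples, distance is Euclidean,
    and the initial center is the point at index `first`.
    Returns (indices of the chosen centers, cluster index for each point).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [first]
    # Distance from every point to its closest center chosen so far.
    dist_to_center = [math.dist(p, points[first]) for p in points]
    assignment = [0] * len(points)

    while len(centers) < k:
        # The next center is the point furthest from all current centers.
        nxt = max(range(len(points)), key=dist_to_center.__getitem__)
        centers.append(nxt)
        # Reassign any point that is closer to the new center.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < dist_to_center[i]:
                dist_to_center[i] = d
                assignment[i] = len(centers) - 1

    return centers, assignment


# Toy "expression profiles" (hypothetical data, two values per gene).
profiles = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
centers, assignment = fpf_centers(profiles, k=3)
```

Each iteration touches every point once, so the sketch uses on the order of n·k distance computations for n points, which is one reason this family of heuristics is attractive for large datasets.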