MicroClAn: Microarray clustering analysis

Authors:
Giulia Bruno; Alessandro Fiori
Affiliations:
Dipartimento di Ingegneria Gestionale e della Produzione, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Torino, Italy; Fondazione Piemontese per la Ricerca sul Cancro-Onlus (FPRC), Institute for Cancer Research and Treatment (IRCC), Str. Prov. 142 Km. 3.95, 10060, Candiolo, Italy
Venue:
Journal of Parallel and Distributed Computing
Year:
2013

Abstract

Evaluating clustering results is a fundamental task in microarray data analysis, because biological knowledge is rarely sufficient to know the true partition of genes in advance. Many quality indexes for gene clustering evaluation have been proposed. A critical issue in this domain is how to compare and aggregate quality indexes in order to select the best clustering algorithm and the optimal parameter setting for a dataset. A second critical issue is execution time, which grows with the huge amount of data generated by microarray experiments and with the need for external resources, such as ontologies, to compute biological indexes. The distributed computation of algorithms and quality indexes therefore becomes essential. To address these issues, this paper presents the MicroClAn framework, a distributed system for evaluating and comparing clustering algorithms using the most widely used quality indexes. The best solution is selected through a two-step aggregation of the rankings produced by the quality indexes. A new index oriented to the biological validation of microarray clustering results is also introduced. Several scheduling strategies integrated in the framework distribute tasks across the grid environment to minimize the completion time. Experimental results show the effectiveness of our aggregation strategy in identifying the best-ranked solution among different clustering algorithms. Moreover, our framework achieves good performance in terms of completion time with few computational resources.
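
The two-step aggregation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule, so everything below is an assumption made for illustration: a simple Borda count stands in for the aggregation method, the grouping of indexes into "statistical" and "biological" families is hypothetical, and the function names (`borda_aggregate`, `two_step_aggregate`) and clustering labels (`kmeans_k5`, `kmeans_k3`, `hier_avg`) are invented. In step one, the rankings produced by the indexes of each family are merged into one ranking per family; in step two, the family rankings are merged into a final ranking.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several rankings (best item first) using a Borda count.

    Assumption: Borda count is used here only as a stand-in; the
    abstract does not say which aggregation rule MicroClAn applies.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # The best item gets n points, the worst gets 1.
            scores[item] += n - position
    # Highest score first; ties are broken alphabetically so the
    # result is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


def two_step_aggregate(groups):
    """Step 1: one ranking per index family. Step 2: merge the families."""
    per_group = [borda_aggregate(rankings) for rankings in groups.values()]
    return borda_aggregate(per_group)


# Hypothetical rankings of three clustering solutions, one list per
# quality index, grouped into two illustrative index families.
index_rankings = {
    "statistical": [
        ["kmeans_k5", "kmeans_k3", "hier_avg"],
        ["kmeans_k5", "hier_avg", "kmeans_k3"],
        ["kmeans_k3", "kmeans_k5", "hier_avg"],
    ],
    "biological": [
        ["hier_avg", "kmeans_k5", "kmeans_k3"],
    ],
}

final_ranking = two_step_aggregate(index_rankings)
print(final_ranking)
```

Aggregating per family first keeps a family with many indexes from outvoting a family with few, which is one plausible motivation for a two-step scheme; whether the paper uses this reasoning is not stated in the abstract.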