Parallel Clustering Algorithm for Large Data Sets with Applications in Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Identifying groups of genes that manifest similar expression patterns is a key step in the analysis of gene expression data. Hierarchical clustering is a method commonly used for that purpose. A fundamental problem with previous implementations of this clustering method is their inability to handle large data sets within reasonable time and memory limits. In this paper, we present a parallel approach for solving this problem. The implementation of the parallel algorithm is illustrated on data from high-dimensional microarray experiments related to gene expression in cancer and in Arabidopsis seedling growth. The results show a considerable reduction in computational time and inter-node communication overhead, especially for large data sets.
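To make the general idea concrete, the following is a minimal sketch of hierarchical clustering in which the pairwise distance computation is split across workers. It is not the algorithm presented in the paper, whose details the abstract does not give. The single-linkage criterion, the Euclidean distance, the round-robin row partitioning, and all function names (`distance_block`, `pairwise_distances`, `single_linkage`) are assumptions chosen for illustration. Threads stand in for compute nodes here; a real deployment would more likely use processes or MPI ranks.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_block(profiles, rows):
    """Distances from each row index in `rows` to all higher-indexed profiles."""
    return {(i, j): euclidean(profiles[i], profiles[j])
            for i in rows
            for j in range(i + 1, len(profiles))}


def pairwise_distances(profiles, workers=2):
    """Assign rows to workers round-robin and merge the partial distance tables."""
    chunks = [range(w, len(profiles), workers) for w in range(workers)]
    dist = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda rows: distance_block(profiles, rows), chunks):
            dist.update(part)
    return dist


def single_linkage(profiles, workers=2):
    """Return the merge history as (members_a, members_b, distance) tuples."""
    dist = pairwise_distances(profiles, workers)
    clusters = {i: (i,) for i in range(len(profiles))}
    merges = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        # Naive search for the closest pair of clusters (illustrative, not efficient).
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                d = min(dist[(min(i, j), max(i, j))]
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        # Merge cluster b into cluster a.
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


if __name__ == "__main__":
    # Hypothetical toy profiles: five "genes" measured under two conditions.
    toy = [[0, 0], [0, 1], [5, 5], [5, 6], [20, 20]]
    for left, right, d in single_linkage(toy, workers=2):
        print(left, right, round(d, 3))
```

Round-robin assignment of rows is used because row i contributes n - i - 1 distances, so contiguous blocks would leave the first worker with most of the work. The merge loop itself remains sequential in this sketch; only the distance table is built in parallel.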