Cloud Technologies for Bioinformatics Applications

Authors:
Jaliya Ekanayake;Thilina Gunarathne;Judy Qiu
Affiliations:
Indiana University, Bloomington;Indiana University, Bloomington;Indiana University, Bloomington
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2011

Citing 0
Cited 7

More convenient more overhead: the performance evaluation of Hadoop streaming
Proceedings of the 2011 ACM Symposium on Research in Applied Computation

Cloud-based image processing system with priority-based data distribution mechanism
Computer Communications

Scalable parallel computing on clouds using Twister4Azure iterative MapReduce
Future Generation Computer Systems

A Multiclass Classification Tool Using Cloud Computing Architecture
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)

Don't match twice: redundancy-free similarity computation with MapReduce
Proceedings of the Second Workshop on Data Analytics in the Cloud

Development of a virtualized supercomputing environment for genomic analysis
The Journal of Supercomputing

A Study on Linear Elastic FEM by Cloud Computing
Proceedings of the Second International Conference on Innovative Computing and Cloud Computing

Abstract

Executing a large number of independent jobs, or jobs comprising a large number of tasks that perform minimal intertask communication, is a common requirement in many domains. Various technologies, ranging from classic job schedulers to the latest cloud technologies such as MapReduce, can be used to execute these "many-tasks" in parallel. In this paper, we present our experience in applying two cloud technologies, Apache Hadoop and Microsoft DryadLINQ, to two bioinformatics applications with the above characteristics. The applications are a pairwise Alu sequence alignment application and an Expressed Sequence Tag (EST) sequence assembly program. First, we compare the performance of these cloud technologies using the above applications, and also compare them with a traditional MPI implementation in one application. Next, we analyze the effect of inhomogeneous data on the scheduling mechanisms of the cloud technologies. Finally, we compare the performance of the cloud technologies on virtual and nonvirtual hardware platforms.