The importance of genomic data analysis has been growing with the recognized need for personalized treatment of human cancers. Next-generation sequencing (NGS) is a cost-effective way to obtain data sets for cancer analysis, so most bioinformatics research groups rely on it. The volume of NGS data is huge and growing rapidly, so supercomputing systems are required to process it within a reasonable time. Bioinformatics researchers analyze these data sets with NGS applications such as BWA and Bowtie, but these legacy applications have limited scalability and resource utilization on supercomputing systems. To resolve this situation, we developed a virtualization technique that improves the resource utilization and scalability of NGS applications. First, to improve resource utilization, we build a virtualized system architecture that allocates virtual machines with the applications' limited resource utilization in mind. Second, to improve scalability, we present a virtualized system architecture that takes data locality into account. Finally, experimental results show that our virtualized system achieves approximately 30% better performance than native systems. In addition, the system that considers data locality achieves twice the speedup of a system using a single storage server.