Improving the research environment of high performance computing for non-cluster experts based on Knoppix instant computing technology

  • Authors:
  • Fumikazu Konishi;Manabu Ishii;Shingo Ohki;Yusuke Hamano;Shuichi Fukuda;Akihiko Konagaya

  • Affiliations:
  • Advanced Genome Information Technology Research Group, Bioknowledge Federation Research Team, RIKEN Genomic Science Center (GSC);Tokyo Metropolitan Institute of Technology;Advanced Genome Information Technology Research Group, Bioknowledge Federation Research Team, RIKEN Genomic Science Center (GSC);VSN Inc;Tokyo Metropolitan Institute of Technology;Advanced Genome Information Technology Research Group, Bioknowledge Federation Research Team, RIKEN Genomic Science Center (GSC)

  • Venue:
  • Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
  • Year:
  • 2006

Abstract

We have designed and implemented a portable system that rapidly constructs a computing environment in which high-throughput research applications can be run immediately. One challenge in instant computing is constructing a cluster system instantly and then readily restoring the machines to their former state. This paper presents an approach to instant computing based on Knoppix technology that allows even a non-computer specialist to easily construct and operate a Beowulf cluster. In the bio-research field, there is an urgent need to address the persistent problems that high-performance computers pose for researchers. We therefore set out to propose a way to build an environment in which a cluster computer system can be set up instantly, and we expect this technology to accelerate scientific research. When this technology is applied to bio-research, however, a capacity barrier arises in using a clustered Knoppix system for a data-driven bioinformatics application. We address this barrier by using a virtual integrated RAM disk to adapt to a parallel file system. As a reference application, we chose InterProScan, an integrated application provided by the European Bioinformatics Institute (EBI) that draws on many databases and scan methods. InterProScan can scale its workload across local computational resources, but biology researchers and even bioinformatics researchers find such extensions difficult to set up.
We have achieved our goal of allowing even researchers who are not cluster experts to easily build a cluster system with “Knoppix for the InterProScan4.1 High Throughput Computing Edition.” The system can not only construct a cluster environment of 32 computers in about ten minutes (as opposed to six hours when done manually) but also restore the original environment simply by rebooting into the pre-existing operating system. The goal of our instant cluster computing is to provide an environment in which any target application can be built instantly from anywhere.