This paper examines the relationship between bioinformatics data processing and its underlying computing architecture in the context of the International Nucleotide Sequence Database Collaboration (INSDC). INSDC exchanges sequence data daily among its three member organizations in the USA, the UK, and Japan. We studied how this sequence database, held in MySQL, can best take advantage of the increased transfer bandwidth of a grid-based storage architecture. GOS was developed in our lab within the UK Government project "Grid-oriented Storage (GOS)" and the EC project "EuroAsiaGrid"; it uses parallel streaming techniques to meet the needs of WAN/Grid-based virtual organizations. In a real-world test over the link between Cambridge and Tokyo, the INSDC sequence database backup operation, mysqldump, ran six times faster over the pipelined GOS architecture than over classic infrastructures. When performing a genomic sequence search against one million records via the underlying GOS architecture, a performance improvement of 67.3% was achieved.