Parametrizing multicore architectures for multiple sequence alignment

  • Authors and affiliations:
  • Sebastian Isaza (Delft University of Technology, The Netherlands); Friman Sanchez (Technical University of Catalonia, Spain); Felipe Cabarcas (Technical University of Catalonia and Barcelona Supercomputing Center, Spain); Alex Ramirez (Technical University of Catalonia and Barcelona Supercomputing Center, Spain); Georgi Gaydadjiev (Delft University of Technology, The Netherlands)

  • Venue:
  • Proceedings of the 8th ACM International Conference on Computing Frontiers
  • Year:
  • 2011

Abstract

Sequence alignment is one of the fundamental tasks in bioinformatics. Due to the exponential growth of biological data and the computational complexity of the algorithms used, high performance computing systems are required. Although multicore architectures have the potential to exploit the task-level parallelism found in these workloads, efficiently harnessing systems with hundreds of cores requires a deep understanding of both the applications and the architecture. As the number of cores grows, shared hardware resources such as buses and memories are likely to saturate, limiting performance scalability. In this paper we evaluate the performance impact of various configurations of an accelerator-based multicore architecture with the aim of revealing and quantifying the bottlenecks. We then compare against a multicore built from general-purpose processors and discuss the performance gap. Our target application is ClustalW, one of the most popular programs for Multiple Sequence Alignment. We characterize different input data sets and show how they influence performance. Simulation results show that, due to the high computation-to-communication ratio and the transfer of data in large chunks, memory latency is well tolerated; bandwidth, however, is critical to achieving maximum performance. A 32 KB cache configuration with 4 banks captures most of the memory traffic and therefore avoids expensive off-chip transactions. In addition, using a hardware queue for task synchronization allows us to handle a large number of cores. Finally, we show that a simple load balancing strategy increases the performance of the general-purpose cores by 28%.
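
The abstract mentions a simple load balancing strategy for the general-purpose cores but does not describe it. The C sketch below is not taken from the paper; it shows one plausible scheme for the pairwise-alignment stage of a ClustalW-like workload, assuming task cost can be approximated by the product of the two sequence lengths. Tasks are sorted longest-first and workers pull them dynamically from a shared counter; all names, lengths, and the cost heuristic are illustrative.

/*
 * Hedged sketch (not from the paper): longest-job-first dispatch of
 * pairwise-alignment tasks. Cost is approximated by the product of the
 * sequence lengths (an assumed heuristic); tasks are sorted by
 * decreasing cost and workers claim them from a shared atomic counter.
 * Compile with: cc -std=c11 -pthread sketch.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct { int i, j; long cost; } task_t;   /* one pairwise alignment */

static task_t *tasks;
static int n_tasks;
static atomic_int next_task;                      /* shared dispatch index  */

static int by_cost_desc(const void *a, const void *b)
{
    long ca = ((const task_t *)a)->cost, cb = ((const task_t *)b)->cost;
    return (ca < cb) - (ca > cb);
}

static void align_pair(const task_t *t)           /* stand-in for the real  */
{                                                 /* dynamic-programming    */
    volatile long sink = 0;                       /* alignment kernel       */
    for (long k = 0; k < t->cost; k++) sink += k;
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        int idx = atomic_fetch_add(&next_task, 1);
        if (idx >= n_tasks) break;                /* queue drained          */
        align_pair(&tasks[idx]);
    }
    return NULL;
}

int main(void)
{
    /* Hypothetical input: 8 sequences with made-up lengths. */
    int len[] = { 120, 800, 450, 300, 950, 200, 670, 510 };
    int n = (int)(sizeof len / sizeof len[0]);

    n_tasks = n * (n - 1) / 2;                    /* all sequence pairs     */
    tasks = malloc((size_t)n_tasks * sizeof *tasks);
    int t = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            tasks[t++] = (task_t){ i, j, (long)len[i] * len[j] };

    qsort(tasks, (size_t)n_tasks, sizeof *tasks, by_cost_desc);

    enum { NWORKERS = 4 };                        /* number of worker cores */
    pthread_t tid[NWORKERS];
    for (int w = 0; w < NWORKERS; w++)
        pthread_create(&tid[w], NULL, worker, NULL);
    for (int w = 0; w < NWORKERS; w++)
        pthread_join(tid[w], NULL);

    printf("completed %d pairwise alignments\n", n_tasks);
    free(tasks);
    return 0;
}

Dispatching the most expensive alignments first reduces the chance that a single long task is left running at the end while the other cores sit idle, a common source of imbalance in pairwise-alignment workloads; whether this matches the authors' actual strategy is not stated in the abstract.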