Towards systolic hardware acceleration for local complexity analysis of massive genomic data
Proceedings of the Great Lakes Symposium on VLSI
While genomics has significantly advanced modern biology, it requires extensive computational power, traditionally supplied by large-scale cluster machines and multi-core systems. However, emerging research results show that FPGA-based acceleration of algorithms for genomic applications greatly improves performance and energy efficiency compared to multi-core systems and clusters. In this work, we present a parallel hardware acceleration architecture for the CAST (Complexity Analysis of Sequence Tracts) algorithm, which biologists use for complexity analysis of protein sequences encoded in genomic data. CAST detects (and subsequently masks) low-complexity regions (LCRs) in protein sequences. We designed and implemented the CAST accelerator architecture and built an FPGA prototype in order to benchmark its performance against serial and multithreaded software implementations of the CAST algorithm. The proposed architecture achieves speedups of approximately 100x to 5000x over both serial and multithreaded software CAST implementations, depending on the system configuration and on dataset features such as low-complexity content and sequence length distribution. Such performance may enable complex analyses of voluminous sequence datasets, and the architecture has the potential to interoperate with other hardware architectures for protein sequence analysis.
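To make the notion of detecting and masking low-complexity regions concrete, the following is a minimal sketch in Python. It is not the CAST algorithm and not the accelerator described here. It assumes a much simpler heuristic: a fixed-length window is flagged when a single residue type reaches a given fraction of it, and that residue is replaced by a mask character. The function name `mask_low_complexity` and the parameters `window`, `threshold`, and `mask_char` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mask_low_complexity(seq, window=12, threshold=0.75, mask_char="X"):
    """Mask the dominant residue in every window it dominates.

    Illustrative heuristic only (an assumption, not CAST): a window counts
    as low-complexity when one residue type makes up at least `threshold`
    of its `window` positions.
    """
    seq = seq.upper()
    masked = list(seq)
    # Slide a fixed-length window over the sequence; sequences shorter
    # than the window are returned unchanged.
    for start in range(max(len(seq) - window + 1, 0)):
        chunk = seq[start:start + window]
        residue, count = Counter(chunk).most_common(1)[0]
        if count / window >= threshold:
            # Mask only the over-represented residue inside this window.
            for i in range(start, start + window):
                if seq[i] == residue:
                    masked[i] = mask_char
    return "".join(masked)


if __name__ == "__main__":
    # A glutamine-rich tract flanked by ordinary residues.
    print(mask_low_complexity("MKT" + "Q" * 12 + "LVA"))
```

Each window is evaluated independently of the others, so the windows of one sequence, and the sequences of a dataset, can in principle be processed in parallel. That kind of data parallelism is what makes this class of analysis a natural candidate for hardware acceleration, although the sketch says nothing about how the proposed architecture is actually organized.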