The architectural landscape of high-performance computing stretches from superscalar uniprocessors to explicitly parallel systems to dedicated hardware implementations of algorithms. Single-purpose hardware can achieve the highest performance, and uniprocessors can be the most programmable. Between these extremes, programmable and reconfigurable architectures offer a wide range of choices in flexibility, programmability, computational density, and performance. The UCSC Kestrel parallel processor strives to attain single-purpose performance while maintaining user programmability. Kestrel is a single-instruction stream, multiple-data stream (SIMD) parallel processor with a 512-element linear array of 8-bit processing elements. The system design focuses on efficient, high-throughput DNA and protein sequence analysis, but its programmability enables high performance on computational chemistry, image processing, machine learning, and other applications. The Kestrel system has remained useful far longer than expected, owing to a careful design and analysis process. Experience with the system leads to the conclusion that programmable SIMD architectures can excel in both programmability and performance. This paper presents the architecture, implementation, applications, and observations of the Kestrel project at the University of California at Santa Cruz.