Reconfigurable hardware solution to parallel prefix computation

Authors:
Jin Hwan Park; H. K. Dai
Affiliations:
Department of Computer Science, State University of New York at New Paltz, New Paltz, USA 12561; Computer Science Department, Oklahoma State University, Stillwater, USA 74078
Venue:
The Journal of Supercomputing
Year:
2008

Abstract

This paper presents the design and implementation of efficient reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, with control logic added to reconfigure the system for an arbitrary degree of parallelism. The system receives multiple input streams of elements in parallel and produces output streams in parallel, and its degree of parallelism can be controlled explicitly at run time. The time complexity of the design is O(d + (N − d)/d), where d is the degree of parallelism and N is the stream size. When the stream size is sufficiently large relative to the initial trigger time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design scales to different problem sizes, including data of unknown size. The design is modular, based on a finite state machine, and was implemented and tested on the target FPGA devices Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.
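
As a rough illustration of what a prefix computation with parallelism degree d produces, the following Python sketch consumes d elements per step and carries the running result forward between steps. It is a software model only, not the paper's FPGA design: the names `streamed_prefix` and `modeled_steps` are hypothetical, the operator is assumed to be associative, and the step count simply evaluates the stated bound d + (N − d)/d with all constant factors assumed to be 1.

```python
from itertools import accumulate
from math import ceil
from operator import add


def streamed_prefix(values, d, op=add):
    """Inclusive prefix computation that consumes d elements per step.

    Assumes `op` is associative; it need not be commutative.
    """
    if d < 1:
        raise ValueError("d must be at least 1")
    out = []
    carry = None  # prefix of everything consumed in earlier steps
    for start in range(0, len(values), d):
        batch = values[start:start + d]
        # Prefixes local to this batch of (at most) d elements.
        local = list(accumulate(batch, op))
        # Fold in the carry so the results are prefixes of the whole stream.
        if carry is not None:
            local = [op(carry, x) for x in local]
        out.extend(local)
        carry = local[-1]
    return out


def modeled_steps(n, d):
    """Evaluate the bound d + (n - d)/d, rounding the second term up.

    Assumption: unit constants; the first term models the pipeline fill.
    """
    return d + max(0, ceil((n - d) / d))


if __name__ == "__main__":
    print(streamed_prefix([1, 2, 3, 4, 5, 6, 7], d=3))
    print(modeled_steps(1024, 8))
```

Because the batch size is an ordinary argument, the same function handles any d and any stream length, which loosely mirrors the run-time control over parallelism described in the abstract. For N much larger than d, `modeled_steps` is dominated by the N/d term.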