This paper presents the design and implementation of efficient, reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, with control logic added to reconfigure the system for an arbitrary degree of parallelism. The system receives multiple input streams of elements in parallel and produces its output streams in parallel, and the degree of parallelism can be controlled explicitly at run time. The time complexity of the design is O(d + (N - d)/d), where d is the degree of parallelism and N is the stream size. When the stream size is sufficiently larger than the initial trigger time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design scales to different problem sizes, including data of unknown size. The design is modular, is based on a finite state machine, and has been implemented and tested on the target FPGA devices Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.
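As a rough software illustration of the streaming behaviour described above, the following Python sketch consumes a stream d elements at a time and carries the running prefix from one block to the next. It is not the paper's hardware design: the function name `blocked_prefix`, the block-plus-carry scheme, and the step counter are assumptions made for illustration, and the operator is assumed to be associative. The step count models only the O(N/d) streaming term, not the initial pipeline trigger time.

```python
from itertools import accumulate, islice
from operator import add


def blocked_prefix(stream, d, op=add):
    """Inclusive prefix computation over `stream`, taking up to d elements per step.

    Hypothetical software model: each loop iteration stands in for one step in
    which d lanes work concurrently. `op` must be associative.
    Returns (prefix_results, number_of_steps).
    """
    if d < 1:
        raise ValueError("d must be at least 1")
    it = iter(stream)          # works for streams whose length is not known in advance
    results = []
    steps = 0
    carry = None               # prefix of everything consumed so far
    while True:
        block = list(islice(it, d))
        if not block:
            break
        # Inclusive scan within the block.
        local = list(accumulate(block, op))
        # Combine with the carry from earlier blocks (skipped for the first block).
        if steps > 0:
            local = [op(carry, v) for v in local]
        carry = local[-1]
        results.extend(local)
        steps += 1
    return results, steps


if __name__ == "__main__":
    values, steps = blocked_prefix(iter(range(1, 8)), d=3)
    print(values, steps)
```

For a stream of N elements the loop runs ceil(N/d) times, so raising d reduces the number of steps in the same proportion, which mirrors the O(N/d) behaviour stated for long streams.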