An Area/Performance Comparison of Subtractive and Multiplicative Divide/Square Root Implementations
ARITH '95 Proceedings of the 12th Symposium on Computer Arithmetic
Time-deterministic WDM star network for massively parallel computing in radar systems
MPPOI '96 Proceedings of the 3rd Conference on Massively Parallel Processing Using Optical Interconnections
Fiber-Ribbon Pipeline Ring Network for High-Performance Distributed Computing Systems
ISPAN '97 Proceedings of the 1997 International Symposium on Parallel Architectures, Algorithms and Networks
Suggestions for implementing a fast IEEE multiply-add-fused instruction
An Analysis of Division Algorithms and Implementations
Performance Evaluation of Selected Job Management Systems
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
In array radar signal processing applications, the processing demands range from tens of GFLOPS to several TFLOPS. To address this, as well as the size and power dissipation issues, a special-purpose "array signal processing" architecture is proposed. We argue that a combined MIMD-SIMD system can give flexibility, scalability, and programmability as well as high computing density. The MIMD system level, where SIMD modules are interconnected by a fiber-optic real-time network, provides the high-level flexibility, while the SIMD module level provides the compute density. In this paper we evaluate different design alternatives and show how the VEGA architecture was derived. By examining the applications and the algorithms used, the SIMD mesh processor is found to be sufficient. However, the smaller the meshes, the better the flexibility and efficiency. Then, based on prototype VLSI implementations and on instruction statistics, we find that a relatively large pipelined processing element maximises the performance per area. It is thereby concluded that a small SIMD mesh processor array with powerful processing elements is the best choice. These observations are further exploited in the design of the single-chip SIMD processor array to be included in the MIMD-style overall system. The system scales from 6.4 GFLOPS to several TFLOPS peak performance.