The VEGA Moderately Parallel MIMD, Moderately Parallel SIMD, Architecture for High Performance Array Signal Processing

Venue:
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Year:
1998


Abstract

In array radar signal processing applications, the processing demands range from tens of GFLOPS to several TFLOPS. To address this, as well as the size and power dissipation issues, a special-purpose "array signal processing" architecture is proposed. We argue that a combined MIMD-SIMD system can give flexibility, scalability, and programmability as well as high computing density. The MIMD system level, where SIMD modules are interconnected by a fiber-optic real-time network, provides the high-level flexibility, while the SIMD module level provides the compute density. In this paper we evaluate different design alternatives and show how the VEGA architecture was derived. By examining the applications and the algorithms used, the SIMD mesh processor is found to be sufficient. However, the smaller the meshes are, the better the flexibility and efficiency. Then, based on prototype VLSI implementations and on instruction statistics, we find that a relatively large pipelined processing element maximises the performance per area. It is thereby concluded that a small SIMD mesh processor array with powerful processing elements is the best choice. These observations are further exploited in the design of the single-chip SIMD processor array to be included in the MIMD-style overall system. The system scales from 6.4 GFLOPS to several TFLOPS peak performance.
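To make the scaling claim concrete, the following is a minimal sketch of how many SIMD modules a given peak-performance target might require. It assumes, which the abstract does not state, that the 6.4 GFLOPS lower bound is the peak rate of a single SIMD module and that peak performance adds linearly as modules are attached to the network. The names `MODULE_PEAK_GFLOPS`, `modules_needed`, and `system_peak_gflops` are hypothetical and chosen only for illustration.

```python
import math

# ASSUMPTION (not stated in the abstract): one SIMD module delivers the
# 6.4 GFLOPS quoted as the smallest configuration, and peak rates of
# identical modules simply add up at the MIMD system level.
MODULE_PEAK_GFLOPS = 6.4


def modules_needed(target_gflops: float,
                   module_peak: float = MODULE_PEAK_GFLOPS) -> int:
    """Smallest number of identical modules whose summed peak meets the target."""
    if target_gflops <= 0 or module_peak <= 0:
        raise ValueError("rates must be positive")
    return max(1, math.ceil(target_gflops / module_peak))


def system_peak_gflops(n_modules: int,
                       module_peak: float = MODULE_PEAK_GFLOPS) -> float:
    """Summed peak rate of n_modules identical modules (no network overhead)."""
    return n_modules * module_peak


if __name__ == "__main__":
    # Illustrative targets spanning the range mentioned in the abstract.
    for target in (6.4, 50.0, 2000.0):
        n = modules_needed(target)
        print(f"target {target:7.1f} GFLOPS -> {n:4d} modules, "
              f"peak {system_peak_gflops(n):7.1f} GFLOPS")
```

This only counts peak arithmetic throughput. Sustained performance would also depend on the real-time network and on how well the algorithms map onto small meshes, neither of which is modelled here.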