The Scalability of FFT on Parallel Computers

Authors:
A. Gupta; V. Kumar
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1993

Citing 20

Designing efficient algorithms for parallel computers.
Parallelization and Performance Analysis of the Cooley-Tukey FFT Algorithm for Shared-Memory Architectures. IEEE Transactions on Computers.
Performance analysis of the FFT algorithm on a shared-memory parallel architecture. IBM Journal of Research and Development.
Reevaluating Amdahl's law. Communications of the ACM.
Parallel depth first search. Part II. analysis. International Journal of Parallel Programming.
Speedup Versus Efficiency in Parallel Systems. IEEE Transactions on Computers.
The design and analysis of parallel algorithms.
Measuring the scalability of parallel computer systems. Proceedings of the 1989 ACM/IEEE conference on Supercomputing.
Optimum Broadcasting and Personalized Communication in Hypercubes. IEEE Transactions on Computers.
Measuring parallel processor performance. Communications of the ACM.
The effect of time constraints on scaled speedup. SIAM Journal on Scientific and Statistical Computing.
FFTs in external or hierarchical memory. The Journal of Supercomputing.
Scalability of parallel machines. Communications of the ACM.
Hypercube algorithms: with applications to image processing and pattern recognition.
Scalability of parallel algorithms for the all-pairs shortest-path problem. Journal of Parallel and Distributed Computing.
Computational frameworks for the fast Fourier transform.
Introduction to parallel computing: design and analysis of algorithms.
A VLSI Architecture for Concurrent Data Structures.
The Design and Analysis of Computer Algorithms.
Experimental Application-Driven Architecture Analysis of an SIMD/MIMD Parallel Processing System. IEEE Transactions on Parallel and Distributed Systems.

Cited 27

Scalability analysis of partitioning strategies for finite element graphs: a summary of results. Proceedings of the 1992 ACM/IEEE conference on Supercomputing.
Unstructured tree search on SIMD parallel computers: a summary of results. Proceedings of the 1992 ACM/IEEE conference on Supercomputing.
Performance and Scalability of Preconditioned Conjugate Gradient Methods on Parallel Computers. IEEE Transactions on Parallel and Distributed Systems.
Future applicability of bus-based shared memory multiprocessors. Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures.
On characterizing bandwidth requirements of parallel applications. Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems.
Are there advantages to high-dimension architectures?: Analysis of k-ary n-cubes for the class of parallel divide-and-conquer algorithms. ICS '96 Proceedings of the 10th international conference on Supercomputing.
An Analytical Method for Predicting the Performance of Parallel Image Processing Operations. The Journal of Supercomputing.
An Application-Driven Study of Parallel System Overheads and Network Bandwidth Requirements. IEEE Transactions on Parallel and Distributed Systems.
Parallel Computing on an Ethernet Cluster of Workstations: Opportunities and Constraints. The Journal of Supercomputing.
Parallel FFT on ATM-based networks of workstations. Cluster Computing.
Isoefficiency: Measuring the Scalability of Parallel Algorithms and Architectures. IEEE Parallel & Distributed Technology: Systems & Technology.
Unstructured Tree Search on SIMD Parallel Computers. IEEE Transactions on Parallel and Distributed Systems.
A Scalable Parallel Formulation of the Backpropagation Algorithm for Hypercubes and Related Architectures. IEEE Transactions on Parallel and Distributed Systems.
Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements. IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing.
A Methodology for User-Oriented Scalability Analysis. ASAP '97 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors.
Two parallel implementations for one dimension FFT on symmetric multiprocessors. ACM-SE 42 Proceedings of the 42nd annual Southeast regional conference.
Parallel Distributed FFT-Based Solvers for 3-D Poisson Problems in Meso-Scale Atmospheric Simulations. International Journal of High Performance Computing Applications.
The "Invaders" Algorithm: Range of Values Modulation for Accelerated Correlation. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Efficient Adaptive Algorithms for Transposing Small and Large Matrices on Symmetric Multiprocessors. Informatica.
Parallel implementations of 1-D fast Fourier transform without interprocessor communication. International Journal of Computers and Applications.
Using GPUs to compute large out-of-card FFTs. Proceedings of the international conference on Supercomputing.
Algorithmic-Parameter optimization of a parallelized split-step Fourier transform using a modified BSP cost model. ISPA'04 Proceedings of the Second international conference on Parallel and Distributed Processing and Applications.
Modelling and analysis of communication overhead for parallel matrix algorithms. Mathematical and Computer Modelling: An International Journal.
Thread vulnerability in parallel applications. Journal of Parallel and Distributed Computing.
FFTs and multiple collective communication on multiprocessor-node architectures. PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I.
Performance-reliability tradeoff analysis for multithreaded applications. DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe.
L24: Parallelism, performance, energy efficiency, and cost trade-offs in future sensor platforms. ACM Transactions on Embedded Computing Systems (TECS).

Abstract

The authors present the scalability analysis of a parallel fast Fourier transform (FFT) algorithm on mesh and hypercube connected multicomputers using the isoefficiency metric. The isoefficiency function of an algorithm-architecture combination is defined as the rate at which the problem size should grow with the number of processors to maintain a fixed efficiency. It is shown that it is more cost-effective to implement the FFT algorithm on a hypercube rather than a mesh despite the fact that large-scale meshes are cheaper to construct than large hypercubes. Although the scope of this work is limited to the Cooley-Tukey FFT algorithm on a few classes of architectures, the methodology can be used to study the performance of various FFT algorithms on a variety of architectures such as SIMD hypercube and mesh architectures and shared memory architecture.
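The isoefficiency idea in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a simple textbook-style cost model for an n-point FFT on p processors (work proportional to n log n, overhead made of a per-message startup term and a per-word transfer term), and the constants `t_c`, `t_s`, and `t_w` as well as both function names are hypothetical. For each processor count it searches for the smallest power-of-two problem size that keeps efficiency at or above a chosen target.

```python
import math


def efficiency(n, p, t_c=1.0, t_s=10.0, t_w=0.5):
    """Efficiency W / (W + T_o) under an ASSUMED cost model (not the paper's).

    W   = t_c * n * log2(n)                      useful work
    T_o = t_s * p * log2(p) + t_w * n * log2(p)  total overhead over all processors
    """
    work = t_c * n * math.log2(n)
    overhead = t_s * p * math.log2(p) + t_w * n * math.log2(p)
    return work / (work + overhead)


def min_problem_size(p, target, **costs):
    """Smallest power-of-two n >= max(2, p) whose efficiency reaches `target`."""
    n = 2
    while n < p:
        n *= 2
    while efficiency(n, p, **costs) < target:
        n *= 2
        if n > 2 ** 60:  # guard: the target may be unreachable in practice
            raise ValueError("target efficiency not reached for n <= 2**60")
    return n


if __name__ == "__main__":
    # How fast must n grow with p to hold efficiency at 0.5 under this model?
    for p in (2, 4, 8, 16, 32, 64):
        n = min_problem_size(p, target=0.5)
        print(f"p={p:3d}  n={n:8d}  E={efficiency(n, p):.3f}")
```

The sequence of problem sizes printed for increasing p is a discrete picture of an isoefficiency function: the faster it grows, the less scalable the algorithm-architecture combination is under the assumed model. Swapping in a different overhead expression (for example, one reflecting a mesh instead of a hypercube) would change that growth rate.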