The ability of a parallel algorithm to make efficient use of increasing computational resources is known as its scalability. In this paper, we develop four parallel algorithms for the 2-dimensional Discrete Wavelet Transform algorithm (2-D DWT), and derive their scalability properties on Mesh and Hypercube interconnection networks. We consider two versions of the 2-D DWT algorithm, known as the Standard (S) and Non-standard (NS) forms, mapped onto $P$ processors under two data partitioning schemes, namely checkerboard (CP) and stripped (SP) partitioning. The two checkerboard partitioned algorithms on the cut-through-routed (CT-routed) Mesh are scalable as $M^{2}=\Omega(P\log P)$ (Non-standard form, NS-CP), and as $M^{2}=\Omega(P\log^{2}P)$ (Standard form, S-CP); while on the store-and-forward-routed (SF-routed) Mesh and Hypercube they are scalable as $M^{2}=\Omega(P^{\frac{3}{3-\gamma}})$ (NS-CP), and as $M^{2}=\Omega(P^{\frac{2}{2-\gamma}})$ (S-CP), respectively, where $M^{2}$ is the number of elements in the input matrix, and $\gamma\in(0,1)$ is a parameter relating $M$ to the number of desired octaves $J$ as $J=\lceil \gamma \log M \rceil$. On the CT-routed Hypercube, scalability of the NS-form algorithms shows similar behavior as on the CT-routed Mesh. The Standard form algorithm with stripped partitioning (S-SP) is scalable on the CT-routed Hypercube as $M^{2}=\Omega(P^{2})$, and it is unscalable on the CT-routed Mesh. Although asymptotically the stripped partitioned algorithm S-SP on the CT-routed Hypercube would appear to be inferior to its checkerboard counterpart S-CP, detailed analysis based on the proportionality constants of the isoefficiency function shows that S-SP is actually more efficient than S-CP over a realistic range of machine and problem sizes. A milder form of this result holds on the CT- and SF-routed Mesh, where S-SP would, asymptotically, appear to be altogether unscalable.
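To get a feel for how differently the growth rates quoted above behave, the following minimal Python sketch evaluates each asymptotic term for a given processor count $P$ and computes the octave count $J$. It is only an illustration under stated assumptions: the function names `octaves` and `isoefficiency_growth` are hypothetical, the logarithm is taken to be base 2, and every proportionality constant is set to 1. The abstract itself stresses that those constants decide which algorithm is more efficient at realistic sizes, so the numbers show growth trends only, not predicted problem sizes.

```python
import math


def octaves(M, gamma):
    """Number of octaves J = ceil(gamma * log M).

    Assumption: the logarithm is base 2 (the abstract does not state the base).
    """
    return math.ceil(gamma * math.log2(M))


def isoefficiency_growth(P, gamma):
    """Growth terms for M^2 as a function of P, as quoted in the abstract.

    Assumption: all proportionality constants are set to 1, so the values
    are only meaningful for comparing growth trends as P increases.
    """
    log_p = math.log2(P)
    return {
        "NS-CP, CT-routed Mesh": P * log_p,
        "S-CP, CT-routed Mesh": P * log_p ** 2,
        "NS-CP, SF-routed": P ** (3 / (3 - gamma)),
        "S-CP, SF-routed": P ** (2 / (2 - gamma)),
        "S-SP, CT-routed Hypercube": P ** 2,
    }


if __name__ == "__main__":
    gamma = 0.5
    print("J for M = 1024:", octaves(1024, gamma))
    for P in (16, 256, 4096):
        print(f"P = {P}")
        for name, value in isoefficiency_growth(P, gamma).items():
            print(f"  {name}: M^2 grows like {value:.1f}")
```

Running the script for a few values of $P$ shows the $P^{2}$ term of S-SP outgrowing the checkerboard terms, which matches the asymptotic ordering in the abstract. The finding that S-SP is more efficient over realistic ranges depends on the constants this sketch omits.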