EURASIP Journal on Applied Signal Processing
In this paper, we introduce and evaluate parallel implementations of two video-sequence decorrelation algorithms based on the non-alternating three-dimensional wavelet transform (3D-WT) and the temporal-window method. The proposed algorithms have been shown to outperform the classic 3D-WT algorithm, offering better coding efficiency and lower computational requirements while enabling lossless coding and top-quality reconstruction, the two features most relevant to medical imaging applications. The parallel implementations are developed and tested on a shared-memory system, an SGI Origin 3800 supercomputer, using the message-passing paradigm. We evaluate and analyze their performance in terms of response time and speed-up factor while varying the number of processors and several video coding parameters. The key to developing highly efficient implementations is a workload distribution strategy supplemented by parallel I/O primitives, which better exploits the inherent features of the application and the computing platform. Two sets of I/O primitives are tested and evaluated: those provided by the C compiler and those belonging to the MPI/IO library.
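As a rough illustration of the ideas named in the abstract, the sketch below shows one possible way to split a video sequence into temporal windows, assign contiguous windows to processes, and compute the speed-up factor as serial time divided by parallel time. It is not the authors' strategy: the function names, the even block distribution, the raw fixed-size frame layout, and the choice to ignore a trailing partial window are all assumptions made for this example.

```python
def partition_windows(num_frames, window_size, num_procs):
    """Assign contiguous temporal windows to processes as evenly as possible.

    Hypothetical helper: returns a list of (rank, first_window, window_count).
    A trailing partial window is ignored in this sketch.
    """
    num_windows = num_frames // window_size
    base, extra = divmod(num_windows, num_procs)
    plan = []
    first = 0
    for rank in range(num_procs):
        # The first `extra` ranks take one additional window each.
        count = base + (1 if rank < extra else 0)
        plan.append((rank, first, count))
        first += count
    return plan


def byte_offset(first_window, window_size, frame_bytes):
    """File offset at which a process would start reading its frames.

    Assumes raw frames of identical size stored back to back.
    """
    return first_window * window_size * frame_bytes


def speedup(t_serial, t_parallel):
    """Speed-up factor: serial response time over parallel response time."""
    return t_serial / t_parallel


if __name__ == "__main__":
    for rank, first, count in partition_windows(64, 8, 3):
        offset = byte_offset(first, 8, 512 * 512)
        print(f"rank {rank}: windows {first}..{first + count - 1}, offset {offset}")
    print("speed-up:", speedup(10.0, 2.5))
```

With per-process offsets of this kind, each process could read its own slice of the input independently, which is the sort of access pattern that parallel I/O primitives are designed to serve.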