Pipelined data parallel algorithms—concept and modeling
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Compiler optimizations for Fortran D on MIMD distributed-memory machines
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
A new model for integrated nested task and data parallel programming
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Available paralellism in video applications
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
A bandwidth-efficient architecture for media processing
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Space-time memory: a parallel programming abstraction for interactive multimedia applications
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Program Improvement by Source-to-Source Transformation
Journal of the ACM (JACM)
Garbage collection of timestamped data in Stampede
Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing
Proceedings of the 27th annual conference on Computer graphics and interactive techniques
Controlled animation of video sprites
Proceedings of the 2002 ACM SIGGRAPH/Eurographics symposium on Computer animation
Stampede: A Programming System for Emerging Scalable Interactive Multimedia Applications
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
Integrated Task and Data Parallel Support for Dynamic Applications
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
CVPR '97 Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97)
D-Stampede: Distributed Programming System for Ubiquitous Computing
ICDCS '02 Proceedings of the 22 nd International Conference on Distributed Computing Systems (ICDCS'02)
Hi-index | 0.00 |
We explore optimization strategies and resulting performance of two stream-based video applications, video texture and color tracker, on a cluster of SMPs. The two applications are representative of a class of emerging applications, which we call ''stream-based applications'', that are sensitive to both latency of individual results and overall throughput. Such applications require non-trivial parallelization techniques in order to improve both latency and throughput, given that the stream data emanates from a limited set of sources (exactly one in the two applications studied) and that the distribution of the data cannot be done a priori. We suggest techniques that address in a coordinated fashion the problems of data distribution and work partitioning. We believe the two problems are related and need to be addressed together. We have parallelized two applications using the Stampede cluster programming system that provides abstractions for implementing time- and throughput-sensitive applications elegantly and efficiently. For the Video Textures application we show that we can achieve a speedup of 24.26 on a 112 processor cluster. For the Color Tracker application, where latency is more crucial, we identify the extent of data parallelism that ensures that the slowest member of the pipeline is no longer the bottleneck for achieving a decent frame rate.