Pure software HDTV video decoding is still a challenging task on entry-level to mid-range desktop and notebook PCs, even with today's microprocessor frequencies measured in GHz. This paper shows that the performance bottleneck in a software MPEG-2 decoder has shifted to memory operations, as microprocessor technologies, including multimedia instruction extensions, have improved rapidly in recent years.

Our study exploits concurrency at the macroblock level to alleviate the performance bottleneck in a software MPEG-2 decoder. First, the paper introduces an interleaved block-order data layout to improve CPU cache performance. Second, the paper describes an algorithm to explicitly prefetch macroblocks for motion compensation. Finally, the paper presents an algorithm to schedule interleaved decoding and output at the macroblock level. Our implementation and experiments show that these methods can effectively hide the latency of memory and the frame buffer. The optimizations improve the performance of a multimedia-instruction-optimized software MPEG-2 decoder by a factor of about two. On a PC with a 933 MHz Pentium III CPU, the decoder can decode and display 1280 × 720-resolution HDTV streams at over 62 frames per second.
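The abstract does not spell out the interleaved block-order layout, so the following is only a minimal sketch of the generic block-order idea it builds on: storing each 8×8 block contiguously instead of spreading its rows across a raster plane. The function names, the single 8-bit plane, and the requirement that the width be a multiple of 8 are assumptions made for illustration; the interleaving itself is not modeled, and this is not the paper's implementation.

```c
#include <stddef.h>

enum { BLOCK = 8 };  /* 8x8 blocks, the unit of the MPEG-2 DCT */

/* Byte offset of pixel (x, y) in a conventional raster (row-major) plane. */
size_t raster_offset(size_t x, size_t y, size_t width) {
    return y * width + x;
}

/* Byte offset of pixel (x, y) when every 8x8 block is stored contiguously
 * (hypothetical helper; assumes width is a multiple of BLOCK). */
size_t block_order_offset(size_t x, size_t y, size_t width) {
    size_t blocks_per_row = width / BLOCK;
    size_t block_index = (y / BLOCK) * blocks_per_row + (x / BLOCK);
    /* 64 bytes per block, then row-major inside the block. */
    return block_index * BLOCK * BLOCK + (y % BLOCK) * BLOCK + (x % BLOCK);
}
```

For a 1280-pixel-wide plane, the block whose top-left pixel is (8, 0) spans offsets 8 through 8975 in raster order, touching eight widely separated rows, whereas in block order it occupies the 64 consecutive bytes from offset 64 through 127. Fewer distinct cache lines per block is the kind of locality gain such a layout aims for.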