The characteristics of multimedia applications when executed on general-purpose processors are not well understood. Such knowledge is extremely important in guiding the development of multimedia applications and the design of future processors.

In this paper, we characterize and optimize the performance of multimedia applications on a superscalar processor that exploits data-level and thread-level parallelism through SIMD (Single Instruction, Multiple Data) and SMT (Simultaneous MultiThreading) capabilities. We show that an SMT and SIMD superscalar processor is well suited to 3D geometry applications, and we characterize their execution in terms of the memory hierarchy, which is the main bottleneck. The results show that SMT does not fully hide memory latency, and that second-level data prefetching does not succeed in increasing performance.

Through detailed analysis, we show that this problem comes from pollution of the instruction window by threads experiencing second-level cache misses, which reduces the window available to the other threads. We therefore propose a hardware mechanism (an architectural optimization) to predict second-level misses and control this pollution.
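The abstract does not describe how the proposed predictor works, so the following is only an illustrative sketch of the general idea, not the paper's mechanism. It assumes a table of 2-bit saturating counters indexed by load address (PC) and a fetch policy that skips threads whose oldest pending load is predicted to miss in the second-level cache. All names (`L2MissPredictor`, `threads_allowed_to_fetch`), the table size, and the threshold are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class L2MissPredictor:
    """Hypothetical table of 2-bit saturating counters indexed by load PC."""
    entries: int = 1024      # assumed table size
    threshold: int = 2       # counter >= threshold means "predict L2 miss"
    table: list = field(default_factory=list)

    def __post_init__(self):
        # Every counter starts at 0, i.e. "predict hit".
        self.table = [0] * self.entries

    def _index(self, pc: int) -> int:
        # Drop the two low bits (assumed 4-byte instructions), then wrap.
        return (pc >> 2) % self.entries

    def predict_miss(self, pc: int) -> bool:
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc: int, missed: bool) -> None:
        # Saturating increment on a miss, saturating decrement on a hit.
        i = self._index(pc)
        if missed:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def threads_allowed_to_fetch(pending_load_pcs, predictor):
    """Return ids of threads not predicted to stall on an L2 miss.

    pending_load_pcs maps thread id -> PC of its oldest pending load,
    or None if the thread has no pending load.
    """
    allowed = []
    for tid, pc in sorted(pending_load_pcs.items()):
        if pc is None or not predictor.predict_miss(pc):
            allowed.append(tid)
    return allowed
```

In this sketch, a thread whose load has recently missed twice in a row is held back from fetching, so its dependent instructions do not occupy shared instruction-window entries while the miss is outstanding. A real design would also need a policy for resuming the thread and for handling mispredictions, which this sketch leaves out.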