Processors with Hyper-Threading technology can improve application performance by letting a single processor behave as if it were two, executing instructions from different threads in parallel rather than serially. However, this potential improvement can only be obtained if the application is multithreaded through parallelization techniques. This paper presents the threaded code generation and optimization techniques in the Intel C++/Fortran compiler. We conduct a performance study of two multimedia applications parallelized with OpenMP pragmas and compiled with the Intel compiler on Hyper-Threading technology (HT) enabled Intel single-processor and multi-processor systems. Our results show that the multithreaded code generated by the Intel compiler achieved a speedup of up to 1.28x on an HT-enabled single-CPU system and up to 2.23x on an HT-enabled dual-CPU system. By measuring IPC (Instructions Per Cycle), UPC (Uops Per Cycle), and cache misses for both the serial and the multithreaded execution of each multimedia application, we draw three key observations: (a) the multithreaded code generated by the Intel compiler yields a good performance gain when parallelization is guided by OpenMP pragmas or directives; (b) exploiting thread-level parallelism (TLP) causes inter-thread interference in the caches and places greater demands on the memory system, but Hyper-Threading technology hides the additional latency, so there is only a small impact on overall program performance; (c) Hyper-Threading technology is effective at exploiting both the task parallelism and the data parallelism inherent in multimedia applications.
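To make the two kinds of parallelism named in observation (c) concrete, the following is a minimal sketch of how OpenMP pragmas can express them in C. It is not taken from the paper: the functions `brighten` and `sum_and_max`, the pixel-buffer layout, and the choice of `parallel for` and `parallel sections` are illustrative assumptions, and the paper's actual multimedia kernels may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Data parallelism (hypothetical kernel): every iteration touches a
   distinct pixel, so the iterations can be divided among threads. */
void brighten(const uint8_t *src, uint8_t *dst, long n, int delta)
{
    long i;
    assert(src != NULL && dst != NULL);
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        int v = src[i] + delta;                 /* private to each iteration */
        dst[i] = (uint8_t)(v > 255 ? 255 : (v < 0 ? 0 : v));
    }
}

/* Task parallelism (hypothetical kernel): two independent computations
   over the same read-only buffer run as separate sections. Each shared
   result variable is written by exactly one section. */
void sum_and_max(const uint8_t *buf, long n, long *sum, int *max)
{
    long s = 0;
    int m = 0;
    assert(buf != NULL && sum != NULL && max != NULL);
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (long i = 0; i < n; i++)
                s += buf[i];
        }
        #pragma omp section
        {
            for (long i = 0; i < n; i++)
                if (buf[i] > m)
                    m = buf[i];
        }
    }
    *sum = s;
    *max = m;
}
```

When built with the compiler's OpenMP option enabled (for example `-fopenmp` with GCC), the pragmas ask the compiler to generate threaded code; without it they are ignored and the functions run serially with the same results. Calling `brighten(src, dst, n, 10)` followed by `sum_and_max(dst, n, &sum, &max)` exercises both patterns.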