A scalable, clustered SMT processor for digital signal processing
MEDEA '03 Proceedings of the 2003 workshop on MEmory performance: DEaling with Applications, systems and architecture
Contentions-conscious dynamic but deterministic scheduling of computational and communication tasks
Proceedings of the 2006 ACM symposium on Applied computing
Journal of Signal Processing Systems - Special Issue: Embedded computing systems for DSP
SAMOS'06 Proceedings of the 6th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
This paper studies the behavior of several core multimedia applications running on a simultaneous multithreading (SMT) architecture and derives design principles for multimedia software engineering. The multimedia workloads range from memory-bound to compute-bound kernels. We introduce a performance metric for evaluating the effective SMT performance gain and compare it with similar metrics for symmetric multiprocessor (SMP) systems. In addition, we analyze and compare SMT and SMP systems and highlight the advantages observed in the studied applications. The results indicate that sharing the cache in SMT processors can provide better cache locality, and thus better performance, even though sharing can also introduce cache conflicts and reduce the effective cache size available to each logical processor. We also propose "mutually beneficial prefetching," a technique that schedules threads so that they prefetch data for each other in order to reduce the cache miss penalty.
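The intuition behind scheduling threads to prefetch for each other can be illustrated with a toy model. The sketch below is not the paper's implementation: it assumes a small shared LRU cache, two logical threads that each touch every block of the same data once, and a hypothetical `offset` parameter that controls how far apart the two threads run. The names `SharedCache` and `run` are introduced here purely for illustration.

```python
from collections import OrderedDict


class SharedCache:
    """Toy shared cache with LRU replacement, counting misses only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # block id -> present, ordered by recency
        self.misses = 0

    def access(self, block):
        if block in self.lines:
            # Hit: refresh the block's recency.
            self.lines.move_to_end(block)
            return
        # Miss: bring the block in and evict the least recently used one.
        self.misses += 1
        self.lines[block] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)


def run(n_blocks, capacity, offset):
    """Interleave two threads over the same n_blocks of data.

    Thread 0 visits block i at step i; thread 1 visits block
    (i + offset) % n_blocks at the same step. `offset` is a hypothetical
    knob standing in for the scheduling decision.
    """
    cache = SharedCache(capacity)
    for i in range(n_blocks):
        cache.access(i)                          # thread 0
        cache.access((i + offset) % n_blocks)    # thread 1
    return cache.misses


# Threads kept close together: each block fetched by one thread is still
# cached when the other thread needs it.
aligned = run(n_blocks=64, capacity=8, offset=0)

# Threads kept far apart: each block is evicted before the other thread
# reaches it, so both threads miss on every access.
apart = run(n_blocks=64, capacity=8, offset=32)

print(aligned, apart)
```

In this toy setting the close schedule incurs 64 misses and the distant one 128, which mirrors the abstract's point that a shared cache can help or hurt depending on how the threads are scheduled. Real SMT hardware adds associativity, hardware prefetchers, and timing effects that this model ignores.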