Very Long Instruction Word (VLIW) processor architectures for multimedia applications are discussed from an algorithm-, hardware-, and system-based point of view. VLIW processors show high flexibility and processing power, as well as a good utilization of resources by compiler-generated code, but their exclusive exploitation of instruction-level parallelism (ILP) decreases in efficiency as the degree of parallelism increases. This is mainly caused by characteristics of multimedia algorithms, increasing wiring delays, compiler restrictions, and a widening gap between on-chip processing speed and available bandwidth to external memory. As new multimedia applications and standards continue to evolve (MPEG-4), the demand for higher processing power will continue. Therefore, parallel processing in all its available forms will have to be exploited to achieve significant performance improvements. We show that, due to the diminishing returns from a further increase in ILP, multimedia applications will benefit more from an additional exploitation of parallelism at thread-level. We examine how simultaneous multithreading (SMT), a novel architectural approach combining VLIW techniques with parallel processing of threads, can efficiently be used to further increase performance of typical multimedia workloads.