Although novel functionality played a dominant role in processor design in the 1990s, the authors predict that implementation concerns will come to dominate over functionality. Designing, debugging, and verifying monolithic designs that use hundreds of millions of transistors will be very difficult, and increasing wire delays will make intrachip communication and clock distribution costly. Consequently, some computer architects advocate shifting from high-performance to high-throughput processing, using distributed components to conquer design-process complexity and exploiting communication locality to cope with wire delays. Multithreaded architectures can extract parallelism from a sequential program via thread-level speculation, which makes them flexible enough to operate in both multiple-program, high-throughput and single-program, high-performance environments. Speculation is the key factor. Multithreaded processors that support concurrent execution of multiple threads on a single chip may dominate some application uses in the next decade. Simultaneous multithreading uses monolithic designs with shared resources among the threads. Multithreading seeks to divide programs into data-independent parallel threads. Before speculative multithreading becomes commonplace in mainstream processors, technologies must be developed for conveying thread information from software to hardware. Likewise, algorithms for thread selection and management, along with hardware and software that support the simultaneous execution of speculative and nonspeculative threads, must be devised.