Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
On-chip communication architecture for OC-768 network processors
Proceedings of the 38th annual Design Automation Conference
Symbiotic jobscheduling for a simultaneous multithreaded processor
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Computer
Area and System Clock Effects on SMT/CMP Processors
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
A Clustered Approach to Multithreaded Processors
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Discovering and Exploiting Program Phases
IEEE Micro
Heterogeneous Chip Multiprocessors
Computer
Dynamically configurable shared CMP helper engines for improved performance
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Dynamic thread assignment on heterogeneous multiprocessor architectures
Proceedings of the 3rd conference on Computing frontiers
Improving the performance and power efficiency of shared helpers in CMPs
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Supporting microthread scheduling and synchronisation in CMPs
International Journal of Parallel Programming
The potential energy efficiency of vector acceleration
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Conjoining soft-core FPGA processors
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Performance/area efficiency in chip multiprocessors with micro-caches
Proceedings of the 4th international conference on Computing frontiers
Accurate branch prediction for short threads
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Dynamic power management framework for multi-core portable embedded system
IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
Hybrid multithreading for VLIW processors
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Compiler techniques for reducing data cache miss rate on a multithreaded architecture
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Software data spreading: leveraging distributed caches to improve single thread performance
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Area-efficient floorplans and interconnects for homogeneous multi-core architectures
International Journal of High Performance Systems Architecture
Necromancer: enhancing system throughput by animating dead cores
Proceedings of the 37th annual international symposium on Computer architecture
Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Federation: Boosting per-thread performance of throughput-oriented manycore architectures
ACM Transactions on Architecture and Code Optimization (TACO)
An evaluation of OpenMP on current and emerging multithreaded/multicore processors
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
Online strategies for high-performance power-aware thread execution on emerging multiprocessors
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Coterminous locality and coterminous group data prefetching on chip-multiprocessors
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Erasing Core Boundaries for Robust and Configurable Performance
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Data layout for cache performance on a multithreaded architecture
Transactions on high-performance embedded architectures and compilers III
Multicore performance optimization using partner cores
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Shared reconfigurable fabric for multi-core customization
Proceedings of the 48th Design Automation Conference
L2-Cache hierarchical organizations for multi-core architectures
ISPA'06 Proceedings of the 2006 international conference on Frontiers of High Performance Computing and Networking
Bahurupi: A polymorphic heterogeneous multi-core architecture
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Function units sharing between neighbor cores in CMP
ICA3PP'10 Proceedings of the 10th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Simultaneous branch and warp interweaving for sustained GPU performance
Proceedings of the 39th Annual International Symposium on Computer Architecture
ACM Transactions on Architecture and Code Optimization (TACO)
Composite Cores: Pushing Heterogeneity Into a Core
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
ACM Transactions on Architecture and Code Optimization (TACO)
The sharing architecture: sub-core configurability for IaaS clouds
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Chip Multiprocessors (CMP) and Simultaneous Multi-threading (SMT) are two approaches that have been proposed to increase processor efficiency. We believe these two approaches are two extremes of a viable spectrum. Between these two extremes, there exists a range of possible architectures, sharing varying degrees of hardware between processors or threads. This paper proposes conjoined-core chip multiprocessing - topologically feasible resource sharing between adjacent cores of a chip multiprocessor to reduce die area with minimal impact on performance and hence improving the overall computational efficiency. It investigates the possible sharing of floating-point units, crossbar ports, instruction caches, and data caches and details the area savings that each kind of sharing entails. It also shows that the negative impact on performance due to sharing is significantly less than the benefits of reduced area. Several novel techniques for intelligent sharing of the hardware resources to minimize performance degradation are presented.