Communications of the ACM - Special section on computer architecture
Cache coherence protocols: evaluation using a multiprocessor simulation model
ACM Transactions on Computer Systems (TOCS)
Hierarchical cache/bus architecture for shared memory multiprocessors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
The Stanford Dash Multiprocessor
Computer
Evaluation of multithreaded uniprocessors for commercial application environments
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
A low-overhead coherence solution for multiprocessors with private cache memories
25 years of the international symposia on Computer architecture (selected papers)
Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Computer
Flow control and micro-architectural mechanisms for extending the performance of interconnection networks
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Design and implementation of the POWER5™ microprocessor
Proceedings of the 41st annual Design Automation Conference
The future of interconnection technology
IBM Journal of Research and Development
A performance methodology for commercial servers
IBM Journal of Research and Development
The circuit and physical design of the POWER4 microprocessor
IBM Journal of Research and Development
Heterogeneous Chip Multiprocessors
Computer
Exploring the cache design space for large scale CMPs
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
PowerViP: Soc power estimation framework at transaction level
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks
Proceedings of the 33rd annual international symposium on Computer Architecture
Flexible Snooping: Adaptive Forwarding and Filtering of Snoops in Embedded-Ring Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Interconnect-Aware Coherence Protocols for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Core architecture optimization for heterogeneous chip multiprocessors
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Efficiently exploring architectural design spaces via predictive modeling
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Design space exploration for multicore architectures: a power/performance/thermal view
Proceedings of the 20th annual international conference on Supercomputing
Support for High-Frequency Streaming in CMPs
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Coherence Ordering for Ring-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip Routers
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Leveraging Optical Technology in Future Bus-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Modelling and simulation of off-chip communication architectures for high-speed packet processors
Journal of Systems and Software
Physical aware frequency selection for dynamic thermal management in multi-core systems
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Conjoining soft-core FPGA processors
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
CMP cache performance projection: accessibility vs. capacity
ACM SIGARCH Computer Architecture News
Proximity-aware directory-based coherence for multi-core processor architectures
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the 34th annual international symposium on Computer architecture
Rotary router: an efficient architecture for CMP interconnection networks
Proceedings of the 34th annual international symposium on Computer architecture
A novel dimensionally-decomposed router for on-chip communication in 3D architectures
Proceedings of the 34th annual international symposium on Computer architecture
Core fusion: accommodating software diversity in chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Comparing memory systems for chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Interconnect design considerations for large NUCA caches
Proceedings of the 34th annual international symposium on Computer architecture
On Characterizing Performance of the Cell Broadband Engine Element Interconnect Bus
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
On the Design of a Photonic Network-on-Chip
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
INTACTE: an interconnect area, delay, and energy estimation tool for microarchitectural explorations
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Efficient architectural design space exploration via predictive modeling
ACM Transactions on Architecture and Code Optimization (TACO)
VEBoC: variation and error-aware design for billions of devices on a chip
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
Embedded processors and systems: Architectural issues and solutions for emerging applications
Journal of Embedded Computing - Embeded Processors and Systems: Architectural Issues and Solutions for Emerging Applications
A consistency architecture for hierarchical shared caches
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Corona: System Implications of Emerging Nanophotonic Technology
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Thermal monitoring mechanisms for chip multiprocessors
ACM Transactions on Architecture and Code Optimization (TACO)
The Journal of Supercomputing
Application-specific Processor Architecture: Then and Now
Journal of Signal Processing Systems
Comparative evaluation of memory models for chip multiprocessors
ACM Transactions on Architecture and Code Optimization (TACO)
Computation and data transfer co-scheduling for interconnection bus minimization
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Adaptive data compression for high-performance low-power on-chip networks
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Tradeoffs in designing accelerator architectures for visual computing
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Power reduction of CMP communication networks via RF-interconnects
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Evaluating SoC Network Performance in MPEG-4 Encoder
Journal of Signal Processing Systems
Area-efficiency in CMP core design: co-optimization of microarchitecture and physical design
ACM SIGARCH Computer Architecture News
Dealing with Traffic-Area Trade-Off in Direct Coherence Protocols for Many-Core CMPs
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Synchronizing redundant cores in a dynamic DMR multicore architecture
IEEE Transactions on Circuits and Systems II: Express Briefs
Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
An analysis of on-chip interconnection networks for large-scale chip multiprocessors
ACM Transactions on Architecture and Code Optimization (TACO)
Reducing signal timing variations in inter-core busses
Integration, the VLSI Journal
HiPC'07 Proceedings of the 14th international conference on High performance computing
Constraint-aware large-scale CMP cache design
HiPC'07 Proceedings of the 14th international conference on High performance computing
The SKB: a semi-completely-connected bus for on-chip systems
NPC'07 Proceedings of the 2007 IFIP international conference on Network and parallel computing
HiPC'08 Proceedings of the 15th international conference on High performance computing
Silicon-photonic network architectures for scalable, power-efficient multi-chip systems
Proceedings of the 37th annual international symposium on Computer architecture
Proceedings of the 37th annual international symposium on Computer architecture
Improving the Performance of GALS-Based NoCs in the Presence of Process Variation
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Branch target buffer design for embedded processors
Microprocessors & Microsystems
Cost-driven 3D integration with interconnect layers
Proceedings of the 47th Design Automation Conference
Low power branch prediction for embedded application processors
Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
Journal of Systems Architecture: the EUROMICRO Journal
A methodology for the characterization of process variation in NoC links
Proceedings of the Conference on Design, Automation and Test in Europe
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Design of a performance enhanced and power reduced dual-crossbar Network-on-Chip (NoC) architecture
Microprocessors & Microsystems
Algorithms for optimally arranging multicore memory structures
EURASIP Journal on Embedded Systems
A workload-adaptive and reconfigurable bus architecture for multicore processors
International Journal of Reconfigurable Computing
Characterizing the impact of process variation on 45 nm NoC-based CMPs
Journal of Parallel and Distributed Computing
Large-scale integrated photonics for high-performance interconnects
ACM Journal on Emerging Technologies in Computing Systems (JETC)
A case for globally shared-medium on-chip interconnect
Proceedings of the 38th annual international symposium on Computer architecture
L2-Cache hierarchical organizations for multi-core architectures
ISPA'06 Proceedings of the 2006 international conference on Frontiers of High Performance Computing and Networking
A high efficient on-chip interconnection network in SIMD CMPs
ICA3PP'10 Proceedings of the 10th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
Packet chaining: efficient single-cycle allocation for on-chip networks
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Complementing user-level coarse-grain parallelism with implicit speculative parallelism
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Low-Overhead, high-speed multi-core barrier synchronization
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Using partial tag comparison in low-power snoop-based chip multiprocessors
ISCA'10 Proceedings of the 2010 international conference on Computer Architecture
Proceedings of the 26th ACM international conference on Supercomputing
Adaptive dynamic frequency scaling for thermal-aware 3d multi-core processors
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part IV
Energy-guided exploration of on-chip network design for exa-scale computing
Proceedings of the International Workshop on System Level Interconnect Prediction
Data transfers on the fly for hierarchical systems of chip multi-processors
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
Stream arbitration: Towards efficient bandwidth utilization for emerging on-chip interconnects
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
ACM Transactions on Architecture and Code Optimization (TACO)
Characterization and cost-efficient selection of NoC topologies for general purpose CMPs
Proceedings of the 2013 Interconnection Network Architecture: On-Chip, Multi-Chip
New heuristic algorithms for low-energy mapping and routing in 3D NoC
International Journal of Computer Applications in Technology
Dynamic cache management in multi-core architectures through run-time adaptation
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Hardware support for accurate per-task energy metering in multicore systems
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
This paper examines the area, power, performance, and design issues for the on-chip interconnects on a chip multiprocessor, attempting to present a comprehensive view of a class of interconnect architectures. It shows that the design choices for the interconnect have significant effect on the rest of the chip, potentially consuming a significant fraction of the real estate and power budget. This research shows that designs that treat interconnect as an entity that can be independently architected and optimized would not arrive at the best multi-core design. Several examples are presented showing the need for careful co-design. For instance, increasing interconnect bandwidth requires area that then constrains the number of cores or cache sizes, and does not necessarily increase performance. Also, shared level-2 caches become significantly less attractive when the overhead of the resulting crossbar is accounted for. A hierarchical bus structure is examined which negates some of the performance costs of the assumed base-line architecture.