Analysis and simulation of a fair queueing algorithm
SIGCOMM '89 Symposium proceedings on Communications architectures & protocols
Virtual clock: a new traffic control algorithm for packet switching networks
SIGCOMM '90 Proceedings of the ACM symposium on Communications architectures & protocols
Comparison of rate-based service disciplines
SIGCOMM '91 Proceedings of the conference on Communications architecture & protocols
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
The iSLIP scheduling algorithm for input-queued switches
IEEE/ACM Transactions on Networking (TON)
Spider: A High-Speed Network Interconnect
IEEE Micro
IEEE Transactions on Parallel and Distributed Systems
Self-Tuned Congestion Control for Multiprocessor Networks
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A New Memory Monitoring Scheme for Memory-Aware Scheduling and Partitioning
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Principles and Practices of Interconnection Networks
Principles and Practices of Interconnection Networks
CQoS: a framework for enabling QoS in shared caches of CMP platforms
Proceedings of the 18th annual international conference on Supercomputing
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
A Quality-of-Service Mechanism for Interconnection Networks in System-on-Chips
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Æthereal Network on Chip: Concepts, Architectures, and Implementations
IEEE Design & Test
METERG: Measurement-Based End-to-End Performance Estimation Technique in QoS-Capable Multiprocessors
RTAS '06 Proceedings of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium
Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
From chaos to QoS: case studies in CMP resource management
ACM SIGARCH Computer Architecture News
Proceedings of the 34th annual international symposium on Computer architecture
Express virtual channels: towards the ideal interconnection fabric
Proceedings of the 34th annual international symposium on Computer architecture
A Framework for Providing Quality of Service in Chip Multi-Processors
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Introduction to the cell broadband engine architecture
IBM Journal of Research and Development
Age-based packet arbitration in large-radix k-ary n-cubes
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Memory performance attacks: denial of memory service in multi-core systems
SS'07 Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium
Achieving predictable performance through better memory controller placement in many-core CMPs
Proceedings of the 36th annual international symposium on Computer architecture
Dynamic performance tuning for speculative threads
Proceedings of the 36th annual international symposium on Computer architecture
Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Low-cost router microarchitecture for on-chip networks
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Preemptive virtual clock: a flexible, efficient, and cost-effective QOS scheme for networks-on-chip
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Application-aware prioritization mechanisms for on-chip networks
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Composing parallel software efficiently with lithe
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Aérgia: exploiting packet latency slack in on-chip networks
Proceedings of the 37th annual international symposium on Computer architecture
A case for FAME: FPGA architecture model execution
Proceedings of the 37th annual international symposium on Computer architecture
Back Suction: Service Guarantees for Latency-Sensitive On-chip Networks
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
RAMP gold: an FPGA-based architecture simulator for multiprocessors
Proceedings of the 47th Design Automation Conference
Approximating age-based arbitration in on-chip networks
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Tessellation: space-time partitioning in a manycore client OS
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Efficient throughput-guarantees for latency-sensitive networks-on-chip
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Thread criticality support in on-chip networks
Proceedings of the Third International Workshop on Network on Chip Architectures
LOFT: A High Performance Network-on-Chip Providing Quality-of-Service Support
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Throughput-Effective On-Chip Networks for Manycore Accelerators
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Probabilistic Distance-Based Arbitration: Providing Equality of Service for Many-Core CMPs
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Enabling quality-of-service in nanophotonic network-on-chip
Proceedings of the 16th Asia and South Pacific Design Automation Conference
CoQoS: Coordinating QoS-aware shared resources in NoC-based SoCs
Journal of Parallel and Distributed Computing
Prefetch-aware shared resource management for multi-core systems
Proceedings of the 38th annual international symposium on Computer architecture
Kilo-NOC: a heterogeneous network-on-chip architecture for scalability and service guarantees
Proceedings of the 38th annual international symposium on Computer architecture
Real-time communication analysis for networks with two-stage arbitration
EMSOFT '11 Proceedings of the ninth ACM international conference on Embedded software
Optimal memory controller placement for chip multiprocessor
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
FeatherWeight: low-cost optical arbitration with QoS support
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
ACM Transactions on Computer Systems (TOCS)
Topology-Aware quality-of-service support in highly integrated chip multiprocessors
ISCA'10 Proceedings of the 2010 international conference on Computer Architecture
RAPA: reliability-aware priority arbitration strategy for network on chip
Proceedings of the great lakes symposium on VLSI
Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication
Dynamically dispatching speculative threads to improve sequential execution
ACM Transactions on Architecture and Code Optimization (TACO)
Dynamic QoS management for chip multiprocessors
ACM Transactions on Architecture and Code Optimization (TACO)
Globally Synchronized Frames for guaranteed quality-of-service in on-chip networks
Journal of Parallel and Distributed Computing
ACM SIGCOMM Computer Communication Review - Special october issue SIGCOMM '12
Addressing End-to-End Memory Access Latency in NoC-Based Multicores
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 40th Annual International Symposium on Computer Architecture
SurfNoC: a low latency and provably non-interfering approach to secure networks-on-chip
Proceedings of the 40th Annual International Symposium on Computer Architecture
Tessellation: refactoring the OS around explicit resource containers with continuous adaptation
Proceedings of the 50th Annual Design Automation Conference
Adaptive virtual channel partitioning for network-on-chip in heterogeneous architectures
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
Designing on-chip networks for throughput accelerators
ACM Transactions on Architecture and Code Optimization (TACO)
Providing multiple hard latency and throughput guarantees for packet switching networks on chip
Computers and Electrical Engineering
Hi-index | 0.00 |
Future chip multiprocessors (CMPs) may have hundreds to thousands of threads competing to access shared resources, and will require quality-of-service (QoS) support to improve system utilization. Although there has been significant work in QoS support within resources such as caches and memory controllers, there has been less attention paid to QoS support in the multi-hop on-chip networks that will form an important component in future systems. In this paper we introduce Globally-Synchronized Frames (GSF), a framework for providing guaranteed QoS in on-chip networks in terms of minimum bandwidth and a maximum delay bound. The GSF framework can be easily integrated in a conventional virtual channel (VC) router without significantly increasing the hardware complexity. We rely on a fast barrier network, which is feasible in an on-chip environment, to efficiently implement GSF. Performance guarantees are verified by both analysis and simulation. According to our simulations, all concurrent flows receive their guaranteed minimum share of bandwidth in compliance with a given bandwidth allocation. The average throughput degradation of GSF on a 8x8 mesh network is within 10% compared to the conventional best-effort VC router in most cases.