Memory-, bandwidth-, and power-aware multi-core for a graph database workload

Authors:
Pedro Trancoso;Norbert Martinez;Josep-Lluis Larriba-Pey
Affiliations:
Department of Computer Science, University of Cyprus, Nicosia, Cyprus;DAMA, UPC, Universitat Politècnica de Catalunya, Barcelona, Catalonia;DAMA, UPC, Universitat Politècnica de Catalunya, Barcelona, Catalonia
Venue:
ARCS'11 Proceedings of the 24th international conference on Architecture of computing systems
Year:
2011

Citing 14
Cited 0

The case for a single-chip multiprocessor

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Piranha: a scalable architecture based on single-chip multiprocessing

Proceedings of the 27th annual international symposium on Computer architecture
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance

Proceedings of the 31st annual international symposium on Computer architecture
Adaptive Cache Compression for High-Performance Processors

Proceedings of the 31st annual international symposium on Computer architecture
Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Niagara: A 32-Way Multithreaded Sparc Processor

IEEE Micro
The V-Way Cache: Demand Based Associativity via Global Replacement

Proceedings of the 32nd annual international symposium on Computer Architecture
Cooperative Caching for Chip Multiprocessors

Proceedings of the 33rd annual international symposium on Computer Architecture
Valgrind: a framework for heavyweight dynamic binary instrumentation

Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Dex: high-performance exploration on large graphs for information retrieval

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Amdahl's Law in the Multicore Era

Computer
Comparative evaluation of memory models for chip multiprocessors

ACM Transactions on Architecture and Code Optimization (TACO)
Scaling the bandwidth wall: challenges in and avenues for CMP scaling

Proceedings of the 36th annual international symposium on Computer architecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

Processors have evolved to the now de-facto standard multicore architecture. The continuous advances in technology allow for increased component density, thus resulting in a larger number of cores on the chip. This, in turn, places pressure on the off-chip and pin bandwidth. Large Last-Level Caches (LLC), which are shared among all cores, have been used as a way to control the out-of-chip requests. In this work we focus on analyzing the memory behavior of a modern demanding application, a graph-based database workload, which is representative of future workloads. We analyze the performance of this application for different cache configurations in terms of: memory access time, bandwidth requirements, and power consumption. The experimental results show that the bandwidth requirements reduce as the number of clusters reduces and the LLC per cluster increases. This configuration is also the most power efficient. If on the other hand, memory latency is the dominant factor, assuming bandwidth is not a limitation, then the best configuration is the one with more clusters and smaller LLCs.