Speeding-up multiprocessors running DBMS workloads through coherence protocols

Authors:
Pierfrancesco Foglia;Roberto Giorgi;Cosimo Antonio Prete
Affiliations:
Dipartimento di Ingegneria dell'Informazione, University of Pisa, Via Diotisalvi 2, Pisa 56100, Italy
Venue:
International Journal of High Performance Computing and Networking
Year:
2004

Abstract

This work shows how a DBMS workload running on a shared-bus, shared-memory multiprocessor can be accelerated by adding simple support to the MESI coherence protocol. As the DBMS workload, we choose the TPC-D benchmark running on the PostgreSQL DBMS. Results show that, for a decision support system (DSS) workload, a write-update (WU) protocol with a selective invalidation strategy for private data improves performance once the contribution of passive sharing is eliminated. The gain has two sources: the access pattern to shared data, and the lower bus utilisation that follows from the absence of invalidation misses. In the 16-processor case, the advantage amounts to a 20% increase in performance. Finally, we show how the results can be extended to other DBMS workloads.
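The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the authors' protocol or simulator. It is a minimal illustration that assumes one word per block, no cache replacement, and a hand-written trace. The names `simulate`, `Stats`, `trace`, and the policy labels are hypothetical. Block "P" stands for private data of a process that first ran on CPU 1 and then migrated to CPU 0 (the passive sharing case), and block "S" stands for data that CPU 0 writes and CPU 1 reads.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    bus_updates: int = 0        # word updates broadcast to remote copies
    bus_invalidations: int = 0  # invalidation transactions on the bus
    misses: int = 0             # block fetches (cold or invalidation misses)


def simulate(trace, policy, private_blocks=frozenset()):
    """Replay (cpu, op, block) accesses under a toy coherence policy.

    policy is one of (labels are illustrative assumptions):
      "invalidate"        write-invalidate, as in a MESI-like protocol
      "update"            pure write-update
      "update+selective"  write-update for shared blocks, but invalidate
                          remote copies of blocks listed in private_blocks
    """
    holders = {}  # block -> set of CPUs holding a valid copy
    stats = Stats()
    for cpu, op, block in trace:
        copies = holders.setdefault(block, set())
        if cpu not in copies:
            stats.misses += 1
            copies.add(cpu)
        if op != "w":
            continue
        remote = copies - {cpu}
        if not remote:
            continue  # no other cache holds the block: no bus traffic
        invalidate = policy == "invalidate" or (
            policy == "update+selective" and block in private_blocks
        )
        if invalidate:
            stats.bus_invalidations += 1
            copies.intersection_update({cpu})  # only the writer keeps a copy
        else:
            stats.bus_updates += 1  # remote copies stay valid and are refreshed
    return stats


# Hypothetical trace: the process touches P on CPU 1, migrates to CPU 0 and
# keeps writing P; then CPU 0 produces S and CPU 1 consumes it three times.
trace = (
    [(1, "r", "P")]
    + [(0, "w", "P")] * 4
    + [(0, "w", "S"), (1, "r", "S")] * 3
)

for policy in ("invalidate", "update", "update+selective"):
    print(policy, simulate(trace, policy, private_blocks={"P"}))
```

On this trace, write-invalidate pays two extra misses on the shared block, pure write-update removes those misses but keeps broadcasting updates to the stale private copy left on CPU 1, and the selective variant avoids both. Real behaviour depends on the workload, block size, and bus timing, none of which this sketch models.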