Scaling and Characterizing Database Workloads: Bridging the Gap between Research and Practice

Authors:
Richard A. Hankins;Trung Diep;Murali Annavaram;Brian Hirano;Harald Eri;Hubert Nueckel;John P. Shen
Affiliations:
Microprocessor Research Labs (MRL) and University of Michigan;Microprocessor Research Labs (MRL);Microprocessor Research Labs (MRL);Server Technologies, Oracle Corporation;Server Technologies, Oracle Corporation;Software Solutions Group, Intel Corporation;Microprocessor Research Labs (MRL)
Venue:
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Year:
2003


Abstract

On-line Transaction Processing (OLTP) workloads are crucial benchmarks for the design and analysis of server processors. Typical cached configurations used by researchers to simulate OLTP workloads are orders of magnitude smaller than the fully scaled configurations used by OEM vendors to achieve world-record transaction processing throughput. The objective of this study is to discover the underlying relationships that characterize OLTP performance over a wide range of configurations. To this end, we have derived the "iron law" of database performance. Using our iron law, we show that both the average instructions executed per transaction (IPX) and the average cycles per instruction (CPI) are critical to the transaction-throughput performance. We use an extensive, empirical examination of an Oracle® based commercial OLTP workload on an Intel® Xeon™ multiprocessor system to characterize the scaling behavior of both the IPX and the CPI. We demonstrate that across a wide range of configurations the IPX and CPI behavior follows predictable trends, which can be accurately characterized by simple linear or piece-wise linear approximations. Based on our data, we propose a method for selecting a minimal, representative workload configuration from which behaviors of much larger OLTP configurations can be accurately extrapolated.
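
The abstract does not spell out the iron law itself. The sketch below is therefore an illustration only. It starts from the arithmetic identity that the average cycles spent per transaction equal IPX × CPI, so a processor clocked at a given rate completes roughly clock rate / (IPX × CPI) transactions per second. Whether the paper states its law in exactly this form is an assumption. The function names, the least-squares helper, and the idea of extrapolating a linear trend from a few small configurations are likewise hypothetical stand-ins for the method the abstract describes. No measured data from the study is used.

```python
def transactions_per_second(clock_hz, ipx, cpi):
    """Throughput implied by cycles-per-transaction = IPX * CPI.

    Assumed form, by analogy with the processor iron law; the abstract
    names IPX and CPI but does not give the formula.
    """
    if clock_hz <= 0 or ipx <= 0 or cpi <= 0:
        raise ValueError("clock_hz, ipx and cpi must be positive")
    return clock_hz / (ipx * cpi)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(xs, ys, x_target):
    """Predict y at x_target from a linear trend fitted to small configurations."""
    slope, intercept = fit_line(xs, ys)
    return slope * x_target + intercept
```

In use, one would fit IPX and CPI separately against whatever configuration parameter is being scaled, extrapolate each to the target configuration, and pass the two predictions to `transactions_per_second`. A piece-wise linear trend could be handled by fitting each segment on its own.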