ACM Transactions on Database Systems (TODS)
Logging RAID - An Approach to Fast, Reliable, and Low-Cost Disk Arrays
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Characteristics of production database workloads and the TPC benchmarks
IBM Systems Journal - End-to-end security
A multi-version cache replacement and prefetching policy for hybrid data delivery environments
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
A mixed transaction processing and operational reporting benchmark
Information Systems Frontiers
There has been very little empirical analysis of any real production database workloads. Although the Transaction Processing Performance Council benchmarks C (TPC-C) and D (TPC-D) have become the standard benchmarks for online transaction processing and decision support systems respectively, there has also not been any major effort to systematically analyze their workload characteristics, especially in relation to those of real production database workloads. In this paper, we examine the characteristics of the production database workloads of ten of the world's largest corporations and compare them to TPC-C and TPC-D. We find that the production workloads exhibit a wide range of behavior; in some cases, the TPC benchmarks fall reasonably within the range of real workload behavior, and in other cases, the TPC benchmarks are not representative of the real workloads. While the two TPC benchmarks generally complement one another in reflecting the characteristics of the production workloads, there are still some aspects of the real workloads that are not represented by either benchmark. Specifically, our analysis suggests that the TPC benchmarks tend to exercise the following aspects of the system differently than the production workloads: concurrency control mechanisms (TPC-C tends to have longer transactions and fewer read-only transactions than the production workloads, while some of TPC-D's transactions are much longer but are read-only and are run serially), workload-adaptive techniques (the production workloads have I/O demands that are much more bursty), scheduling and resource allocation policies (unlike TPC-C, whose transactions are very regular, and TPC-D, where the queries are run serially, the production workloads tend to have many concurrent and diverse transactions), and I/O optimizations for temporary and index files (TPC-C has no I/O activity to temporary objects, while most of TPC-D's references are directed at index objects).
In this paper, we also reexamine Amdahl's rule of thumb for a typical data processing system (one bit of I/O for every instruction) and discover that both the TPC benchmarks and the production workloads generate on the order of 0.5 to 1.0 bit of logical I/O per instruction, surprisingly close to the much earlier figure.
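
To make the unit in Amdahl's rule of thumb concrete, the ratio can be computed as bits of logical I/O divided by instructions executed. The following is a minimal sketch only: the function name, the 4 KiB page size, and the reference and instruction counts are hypothetical values chosen for illustration, not measurements from the paper.

```python
def io_bits_per_instruction(io_bytes: float, instructions: float) -> float:
    """Return bits of I/O transferred per instruction executed."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return (io_bytes * 8) / instructions


# Hypothetical inputs (assumptions, not data from the paper):
PAGE_BYTES = 4096                 # assumed logical page size
logical_page_refs = 25_000        # assumed logical page references
instructions = 1_000_000_000      # assumed instructions executed

ratio = io_bits_per_instruction(logical_page_refs * PAGE_BYTES, instructions)
print(f"{ratio:.3f} bits of logical I/O per instruction")  # prints 0.819
```

With these assumed inputs the ratio falls inside the 0.5 to 1.0 range quoted above; a value of exactly 1.0 would correspond to Amdahl's original figure.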