A performance comparison of multi-micro and mainframe database architectures

  • Authors:
  • Philip Heidelberger;Seetha Lakshmi

  • Affiliations:
  • IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York;IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York

  • Venue:
  • SIGMETRICS '87 Proceedings of the 1987 ACM SIGMETRICS conference on Measurement and modeling of computer systems
  • Year:
  • 1987

Abstract

Database machine architectures consisting of multiple microprocessors or mini-computers are attracting wide attention. There have been several proposals and prototypes (see, e.g., DeWitt, Gerber, Graefe, Heytens, Kumar and Muralikrishna (1986), Fishman, Lai and Wilkinson (1984), Hsiao (1983), or the 1983 and 1985 Proceedings of the International Workshop on Database Machines). There is also a commercially available system based on multiple microprocessors (Teradata (1984)). With these architectures it is possible to exploit parallelism at three levels: within a single query, within a single transaction, and by simultaneously executing multiple independent transactions. The rationale behind these multiple microprocessor architectures is primarily to take advantage of the potentially lower cost per MIPS (Millions of Instructions per Second, a measure of processing power) of microprocessors as opposed to mainframes. In addition, database machines may offer incremental capacity growth as well as improved performance for large queries by exploiting parallelism within a single query. However, it is not clear whether database machines made of multiple microprocessors indeed have any cost/performance advantage over more conventional mainframe-based database management systems. Several papers on the performance analysis of database machines can be found in the literature (e.g., Salza, Terranova and Velardi (1983) or Bit and Hartman (1985)). Most of these studies have focused on determining the execution time of a single query in a particular database machine architecture. Few studies have dealt with the response time of single queries in a multi-user environment. We are not aware of any papers that systematically study the performance trade-offs between a multi-microprocessor database machine and a large mainframe system.
This paper presents such a systematic study. We examine a hypothetical database machine that uses standard microprocessors and disks; database machines that use special purpose hardware are not considered here (e.g., Sakai, Kamiya, Iwata, Abe, Tanaka, Shibayama and Murakami (1984)). However, we do not limit our studies to the components available today; we also consider processors and disks projected to be available in the future. We assume that both the database machine and the mainframe provide relational database functions (e.g., Date (1986)). While there are several applications for relational databases (on-line transaction processing, ad-hoc queries, etc.), we limit our attention to one specific application domain, namely high-volume on-line transaction processing. In this domain, we consider a range of transactions and investigate the sensitivity of the two architectures to various transaction-related parameters. Dias, Iyer and Yu (1986), in a similar study, have investigated the issue of coupling many small systems to obtain performance comparable to that of a few (coupled) large systems. Their study is limited to a specific workload with no parametric or sensitivity study with respect to transaction characteristics, and the architectures they compared are quite different from the database machine considered in this paper. For high-volume transaction processing environments, there appears to be only a limited potential to exploit parallelism within a single transaction. It is therefore expected that, since the database machine is made of slower processors and since the functions are distributed across several processors, it would require more aggregate processing capacity, or MIPS, than the mainframe to sustain a given throughput and response time. Thus there is a trade-off between the cheaper cost per MIPS of microprocessors as opposed to mainframes and the increase in aggregate MIPS required by the database machine to achieve a given performance level.
This paper addresses this trade-off through the use of queueing network performance models of the two architectures. Assuming that the MIPS ratings of the microprocessor and mainframe are equivalent, our models indicate that with today's processor technology, the performance of the database machine is sensitive to the transaction complexity, the amount of skew in the data access pattern, the amount of overhead required to implement the distributed database function, and the buffer miss ratio. Furthermore, there is only a narrow range of transaction processing workloads for which the database machine can meet a prespecified response time objective with only a moderate increase in aggregate processing capacity over that of the mainframe. However, using the technology projected for the early 1990s, our models predict that the performance of the hypothetical database machine is less sensitive to the above factors. Assuming that the level of lock contention is low, the memory hierarchies of the two architectures are equivalent (in the sense of achieving equal buffer miss ratios), and the performance of the disks is equivalent in the two architectures, the models predict that the performance objective can be met with only a moderate increase in aggregate capacity for a broader range of transaction workloads. The workloads considered in this paper consist of relatively short transactions based on primary key retrievals and updates. It is therefore difficult to draw general conclusions about the overall superiority of one architecture over the other when a mixed set of workloads is expected (our study assumes that all transactions have the same expected pathlength and I/O activity). This study focuses on performance issues and specifically does not address such issues as MIPS flexibility (general purpose versus special purpose architectures), security, recovery and system management.
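
The trade-off between cheaper MIPS and a larger aggregate MIPS requirement can be illustrated with a deliberately simplified sketch. It is not the queueing network model used in the paper: it assumes each processor behaves as an independent M/M/1 queue, that transactions are spread evenly over the microprocessors (no skew), that a transaction runs on a single processor (no intra-transaction parallelism), and that I/O, lock contention and buffer misses are ignored. The function names, the `overhead` parameter and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue (infinite if saturated)."""
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def mainframe_mips_needed(throughput, pathlength_mi, target_seconds):
    """MIPS a single M/M/1 server needs to meet the response time target.

    From 1 / (mu - lambda) <= R we get mu >= lambda + 1/R, and
    mu = MIPS / pathlength (pathlength in millions of instructions).
    """
    return pathlength_mi * (throughput + 1.0 / target_seconds)


def micros_needed(throughput, pathlength_mi, micro_mips, target_seconds,
                  overhead=0.0, max_nodes=10_000):
    """Smallest number of microprocessors that meets the target, or None.

    `overhead` is an assumed fractional pathlength increase standing in
    for the cost of the distributed database function.
    """
    service_rate = micro_mips / (pathlength_mi * (1.0 + overhead))
    # With no parallelism inside a transaction, one transaction can never
    # finish faster than its service time on a single microprocessor.
    if 1.0 / service_rate >= target_seconds:
        return None
    for n in range(1, max_nodes + 1):
        if mm1_response_time(throughput / n, service_rate) <= target_seconds:
            return n
    return None


if __name__ == "__main__":
    tps, pathlength, target = 100.0, 0.5, 0.25  # illustrative values only
    print("mainframe MIPS:", mainframe_mips_needed(tps, pathlength, target))
    for ovh in (0.0, 0.2):
        n = micros_needed(tps, pathlength, 5.0, target, overhead=ovh)
        print(f"overhead {ovh:.0%}: {n} micros, {n * 5.0} aggregate MIPS")
```

Under these assumptions the microprocessor configuration needs more aggregate MIPS than the single fast server to meet the same response time target, and the gap widens as the assumed overhead grows. When a processor is too slow to complete one transaction within the target, no number of processors suffices, which is why `micros_needed` returns `None` in that case.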