The architecture and algorithms of database systems have been built around the properties of existing hardware technologies. Many of these elementary design assumptions are 20 to 30 years old. Over the last five years, multiple new I/O technologies (e.g., Flash SSDs, NV-Memories) have emerged that have the potential to change these assumptions. Some of the key technological differences from traditional spinning-disk storage are: (i) asymmetric read/write performance; (ii) low latencies; (iii) fast random reads; (iv) endurance issues. Cost functions used by traditional database query optimizers are directly influenced by these properties. Most cost functions estimate the cost of algorithms based on metrics such as sequential and random I/O costs, in addition to CPU and memory consumption. They account neither for read/write asymmetry nor for the combination of fast random reads and inferior random writes, which represents a significant mismatch. In this paper we present a new asymmetry-aware cost model for Flash SSDs with adapted cost functions for algorithms such as external sort, hash join, sequential scan, and index scan. It has been implemented in PostgreSQL and tested with TPC-H. Additionally, we describe a tool that automatically finds good settings for the base coefficients of cost models. After tuning the configuration of both the original and the asymmetry-aware cost model with that tool, the optimizer with the asymmetry-aware cost model selects faster execution plans for 14 of the 22 TPC-H queries (the rest being the same or negligibly worse). We achieve an overall performance improvement of 48% on SSD.
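To make the idea of an asymmetry-aware cost function concrete, the following Python sketch contrasts a cost model with one coefficient per access pattern against one with separate read and write coefficients. It is an illustration only, not the model from the paper: the class `IOCosts`, the functions, the textbook external merge sort formula, and every coefficient value are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IOCosts:
    """Per-page I/O cost coefficients (hypothetical, unitless)."""
    seq_read: float
    rand_read: float
    seq_write: float
    rand_write: float


def symmetric_costs(seq_page: float, rand_page: float) -> IOCosts:
    """A traditional-style model: reads and writes cost the same."""
    return IOCosts(seq_page, rand_page, seq_page, rand_page)


def seq_scan_cost(table_pages: int, c: IOCosts) -> float:
    # Every page of the table is read once, sequentially.
    return table_pages * c.seq_read


def index_scan_cost(index_pages: int, heap_pages: int, c: IOCosts) -> float:
    # Simplification: every index and heap page touched is one random read.
    return (index_pages + heap_pages) * c.rand_read


def external_sort_cost(pages: int, mem_pages: int, c: IOCosts) -> float:
    """Textbook external merge sort, assuming each pass reads and
    writes every page once (the final write is not pipelined away)."""
    if mem_pages < 3:
        raise ValueError("need at least 3 buffer pages to merge")
    if pages <= mem_pages:
        return pages * c.seq_read  # fits in memory: read once, no spill
    runs = math.ceil(pages / mem_pages)  # sorted runs after pass 0
    fan_in = mem_pages - 1               # one buffer page reserved for output
    merge_passes = 0
    while runs > 1:
        runs = math.ceil(runs / fan_in)
        merge_passes += 1
    passes = 1 + merge_passes            # run generation + merges
    return passes * pages * (c.seq_read + c.seq_write)


def cheapest_scan(table_pages: int, index_pages: int,
                  heap_pages: int, c: IOCosts) -> str:
    """Pick the cheaper of the two access paths under cost model c."""
    seq = seq_scan_cost(table_pages, c)
    idx = index_scan_cost(index_pages, heap_pages, c)
    return "index_scan" if idx < seq else "seq_scan"


# Illustrative coefficients only; real values would come from calibration.
HDD_LIKE = symmetric_costs(seq_page=1.0, rand_page=4.0)
FLASH_LIKE = IOCosts(seq_read=1.0, rand_read=1.1,
                     seq_write=3.0, rand_write=10.0)
```

With these made-up coefficients, a scan that touches 3 index pages and 300 heap pages of a 1000-page table is costed as more expensive than a sequential scan under `HDD_LIKE` but cheaper under `FLASH_LIKE`, so `cheapest_scan` flips from `"seq_scan"` to `"index_scan"`. In the other direction, `external_sort_cost` grows once writes are priced separately, which is the kind of shift in plan choice that the abstract attributes to accounting for asymmetry.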