Performance tradeoffs in read-optimized databases

Authors:
Stavros Harizopoulos; Velen Liang; Daniel J. Abadi; Samuel Madden
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA (all authors)
Venue:
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Year:
2006

Citing 20
Cited 37

An Effective Approach to Vertical Partitioning for Physical Design of Relational Databases

IEEE Transactions on Software Engineering
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads

Proceedings of the 25th annual international symposium on Computer architecture
Computer architecture (2nd ed.): a quantitative approach

Computer architecture (2nd ed.): a quantitative approach
A decomposition storage model

SIGMOD '85 Proceedings of the 1985 ACM SIGMOD international conference on Management of data
The implementation and performance of compressed databases

ACM SIGMOD Record
Compressing Relations and Indexes

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Block Oriented Processing of Relational Database Operations in Modern Computer Architectures

Proceedings of the 17th International Conference on Data Engineering
Database Architecture Optimized for the New Bottleneck: Memory Access

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
DBMSs on a Modern Processor: Where Does Time Go?

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Weaving Relations for Cache Performance

Proceedings of the 27th International Conference on Very Large Data Bases
Buffering database operations for enhanced instruction cache performance

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Integrating vertical and horizontal partitioning into automated physical database design

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
QPipe: a simultaneously pipelined relational query engine

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
C-store: a column-oriented DBMS

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Super-Scalar RAM-CPU Cache Compression

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Integrating compression and execution in column-oriented database systems

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
A case for fractured mirrors

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Data compression in Oracle

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Clotho: decoupling memory page layout from storage organization

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Database architectures for new hardware

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

How to barter bits for chronons: compression and bandwidth trade offs for database scans

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
To share or not to share?

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Cooperative scans: dynamic bandwidth sharing in a DBMS

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
RadixZip: linear time compression of token streams

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Column-stores vs. row-stores: how different are they really?

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Read-Optimized, Cache-Conscious, Page Layouts for Temporal Relational Data

DEXA '08 Proceedings of the 19th international conference on Database and Expert Systems Applications
Read-optimized databases, in depth

Proceedings of the VLDB Endowment
Main-memory scan sharing for multi-core CPUs

Proceedings of the VLDB Endowment
Architecture of a Database System

Foundations and Trends in Databases
Fast scans and joins using flash drives

Proceedings of the 4th international workshop on Data management on new hardware
Avoiding version redundancy for high performance reads in temporal databases

Proceedings of the 4th international workshop on Data management on new hardware
DSM vs. NSM: CPU performance tradeoffs in block-oriented query processing

Proceedings of the 4th international workshop on Data management on new hardware
Query processing techniques for solid state drives

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Dictionary-based order-preserving string compression for main memory column stores

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Self-organizing tuple reconstruction in column-stores

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Fine-grained updates in database management systems for flash memory

Information Sciences: an International Journal
SPAX --- PAX with Super-Pages

ADBIS '09 Proceedings of the 13th East European Conference on Advances in Databases and Information Systems
Modular data storage with Anvil

Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Column-oriented database systems

Proceedings of the VLDB Endowment
SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units

Proceedings of the VLDB Endowment
Affinity analysis of coded data sets

Proceedings of the 2009 EDBT/ICDT Workshops
Analyzing the energy efficiency of a database server

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Vertical partitioning for flash and HDD database systems

Journal of Systems and Software
The performance of MapReduce: an in-depth study

Proceedings of the VLDB Endowment
Database compression on graphics processors

Proceedings of the VLDB Endowment
Efficient and scalable data evolution with column oriented databases

Proceedings of the 14th International Conference on Extending Database Technology
SQL server column store indexes

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
A case for micro-cellstores: energy-efficient data management on recycled smartphones

Proceedings of the Seventh International Workshop on Data Management on New Hardware
Optimizing write performance for read optimized databases

DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
tsdb: a compressed database for time series

TMA'12 Proceedings of the 4th international conference on Traffic Monitoring and Analysis
Real-time creation of bitmap indexes on streaming network data

The VLDB Journal — The International Journal on Very Large Data Bases
Query optimization with value path materialization in column-stored DWMS

Proceedings of the 3rd International Conference on Computing for Geospatial Research and Applications
U2SOD-DB: a database system to manage large-scale ubiquitous urban sensing origin-destination data

Proceedings of the ACM SIGKDD International Workshop on Urban Computing
Towards energy-efficient database cluster design

Proceedings of the VLDB Endowment
Sliced column-store (SCS): ontological foundations and practical implications

ER'12 Proceedings of the 31st international conference on Conceptual Modeling
Enhancements to SQL server column stores

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Design and evaluation of storage organizations for read-optimized main memory databases

Proceedings of the VLDB Endowment

Abstract

Database systems have traditionally optimized performance for write-intensive workloads. Recently, there has been renewed interest in architectures that optimize read performance by using column-oriented data representation and light-weight compression. This previous work has shown that under certain broad classes of workloads, column-based systems can outperform row-based systems. Previous work, however, has not characterized the precise conditions under which a particular query workload can be expected to perform better on a column-oriented database.

In this paper we first identify the distinctive components of a read-optimized DBMS and describe our implementation of a high-performance query engine that can operate on both row and column-oriented data. We then use our prototype to perform an in-depth analysis of the tradeoffs between column and row-oriented architectures. We explore these tradeoffs in terms of disk bandwidth, CPU cache latency, and CPU cycles. We show that for most database workloads, a carefully designed column system can outperform a carefully designed row system, sometimes by an order of magnitude. We also present an analytical model to predict whether a given workload on a particular hardware configuration is likely to perform better on a row or column-based system.
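
The disk-bandwidth versus CPU tradeoff described in the abstract can be illustrated with a toy scan-cost calculation. This is only a sketch under simplifying assumptions and is not the paper's analytical model: the function names, the fixed-width columns, the assumption that I/O and CPU work fully overlap, and all numbers in the usage lines are hypothetical.

```python
from typing import Sequence


def scan_bytes_row(num_tuples: int, column_widths: Sequence[int]) -> int:
    """Bytes read by a full scan of a row layout (assumed: whole tuples are read)."""
    return num_tuples * sum(column_widths)


def scan_bytes_column(
    num_tuples: int, column_widths: Sequence[int], selected: Sequence[int]
) -> int:
    """Bytes read by a column layout (assumed: only the selected columns are read)."""
    return num_tuples * sum(column_widths[i] for i in selected)


def scan_seconds(bytes_read: int, disk_bytes_per_sec: float, cpu_seconds: float) -> float:
    """Scan time if I/O and CPU overlap perfectly: the slower resource dominates."""
    return max(bytes_read / disk_bytes_per_sec, cpu_seconds)


# Hypothetical workload: 10M tuples, 16 four-byte columns, query touches 2 columns.
widths = [4] * 16
row_bytes = scan_bytes_row(10_000_000, widths)
col_bytes = scan_bytes_column(10_000_000, widths, selected=[0, 5])

# Hypothetical hardware: 100 MB/s disk; the CPU costs below are made-up inputs.
row_time = scan_seconds(row_bytes, 100e6, cpu_seconds=1.0)
col_time = scan_seconds(col_bytes, 100e6, cpu_seconds=1.5)
print(row_bytes, col_bytes, row_time, col_time)
```

With these made-up inputs the row scan is disk-bound and the column scan is CPU-bound, which is the kind of crossover a workload-and-hardware model would need to capture. Widening the projection or raising disk bandwidth shifts the balance toward the row layout.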