There has been a significant amount of excitement about, and recent work on, column-oriented database systems ("column-stores"). These database systems have been shown to perform more than an order of magnitude better than traditional row-oriented database systems ("row-stores") on analytical workloads such as those found in data warehouses, decision support, and business intelligence applications. The elevator pitch behind this performance difference is straightforward: column-stores are more I/O efficient for read-only queries since they only have to read from disk (or from memory) those attributes accessed by a query. This simplistic view leads to the assumption that one can obtain the performance benefits of a column-store using a row-store: either by vertically partitioning the schema, or by indexing every column so that columns can be accessed independently. In this paper, we demonstrate that this assumption is false. We compare the performance of a commercial row-store under a variety of different configurations with a column-store and show that the row-store is significantly slower on a recently proposed data warehouse benchmark. We then analyze the performance difference and show that there are some important differences between the two systems at the query executor level (in addition to the obvious differences at the storage layer level). Using the column-store, we then tease apart these differences, demonstrating the impact on performance of a variety of column-oriented query execution techniques, including vectorized query processing, compression, and a new join algorithm we introduce in this paper. We conclude that while it is not impossible for a row-store to achieve some of the performance advantages of a column-store, changes must be made to both the storage layer and the query executor to fully obtain the benefits of a column-oriented approach.
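As a rough illustration of the I/O argument in the elevator pitch (not the paper's benchmark, and not any real system's storage format), the following Python sketch packs the same toy table once tuple by tuple and once attribute by attribute, then counts the bytes a single-attribute aggregate has to read in each layout. The table contents, the four-integer schema, and all function names are hypothetical and chosen only for this example.

```python
import struct

# Hypothetical 4-attribute table; the values are illustrative only.
ROWS = [(i, i % 7, i * 10, i % 3) for i in range(1000)]
ROW_FMT = "<4i"  # assumption: four 32-bit little-endian integers per tuple


def build_row_store(rows):
    """Pack whole tuples back to back, as a row-oriented page would."""
    return b"".join(struct.pack(ROW_FMT, *r) for r in rows)


def build_column_store(rows):
    """Pack each attribute into its own contiguous byte string."""
    n = len(rows)
    return [struct.pack(f"<{n}i", *col) for col in zip(*rows)]


def sum_attr_row_store(data, attr):
    """Sum one attribute; every full tuple is read to reach it."""
    size = struct.calcsize(ROW_FMT)
    total = 0
    for off in range(0, len(data), size):
        total += struct.unpack_from(ROW_FMT, data, off)[attr]
    return total, len(data)  # (result, bytes read)


def sum_attr_column_store(columns, attr):
    """Sum one attribute; only that attribute's bytes are read."""
    col = columns[attr]
    n = len(col) // struct.calcsize("<i")
    return sum(struct.unpack(f"<{n}i", col)), len(col)  # (result, bytes read)


row_total, row_bytes = sum_attr_row_store(build_row_store(ROWS), 2)
col_total, col_bytes = sum_attr_column_store(build_column_store(ROWS), 2)
print(row_total == col_total, row_bytes, col_bytes)
```

In this toy setup, summing one of four equally sized attributes reads 16,000 bytes from the row layout and 4,000 bytes from the column layout. This captures only the storage-layer effect; the abstract's point is that this saving alone does not explain the full performance gap, since the query executor differs as well.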