Fast scans and joins using flash drives

Authors:
Mehul A. Shah;Stavros Harizopoulos;Janet L. Wiener;Goetz Graefe
Affiliations:
HP Labs;HP Labs;HP Labs;HP Labs
Venue:
Proceedings of the 4th international workshop on Data management on new hardware
Year:
2008

Abstract

As access times to main memory and disks continue to diverge, faster non-volatile storage technologies become more attractive for speeding up data analysis applications. NAND flash is one such promising substitute for disks. Flash offers faster random reads than disk, consumes less power than disk, and is cheaper than DRAM. In this paper, we investigate alternative data layouts and join algorithms suited for systems that use flash drives as the non-volatile store. All of our techniques take advantage of the fast random reads of flash. We convert traditional sequential I/O algorithms to ones that use a mixture of sequential and random I/O to process less data in less time. Our measurements on commodity flash drives show that a column-major layout of data pages is faster than a traditional row-based layout for simple scans. We present a new join algorithm, RARE-join, designed for a column-based page layout on flash and compare it to a traditional hash join algorithm. Our analysis shows that RARE-join is superior in many practical cases: when join selectivities are small and only a few columns are projected in the join result.
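
The claim that a column-major layout lets a scan "process less data" can be illustrated with a rough page-count estimate. The sketch below is not the paper's cost model; it is a simplified illustration that assumes fixed-width columns, a hypothetical 4 KB read unit, and a column layout in which each read unit holds values of a single column, so a scan can skip the columns it does not project. The function names and parameters are invented for this example.

```python
import math


def pages_read_row_layout(num_rows, col_widths, page_size):
    """Pages a full scan touches when every page stores whole rows.

    All columns are read, whether or not the query projects them.
    """
    row_width = sum(col_widths)
    rows_per_page = page_size // row_width
    return math.ceil(num_rows / rows_per_page)


def pages_read_column_layout(num_rows, col_widths, projected, page_size):
    """Pages a scan touches when each read unit stores one column's values.

    Only the projected columns are read (assumption: the device can skip
    the other columns cheaply, which is where fast random reads matter).
    """
    total = 0
    for c in projected:
        values_per_page = page_size // col_widths[c]
        total += math.ceil(num_rows / values_per_page)
    return total


if __name__ == "__main__":
    # Hypothetical table: 1,000,000 rows, ten 8-byte columns, 4 KB pages.
    rows, widths, page = 1_000_000, [8] * 10, 4096
    print("row layout pages:   ", pages_read_row_layout(rows, widths, page))
    print("column layout pages:", pages_read_column_layout(rows, widths, [0, 1], page))
```

Under these assumptions, projecting two of ten columns reads roughly a fifth of the pages that a row-based scan reads. The estimate ignores seek costs, so it only favors the column layout on devices where skipping between read units is cheap, as the abstract says is the case for flash.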