Solid state drives perform random reads more than 100x faster than traditional magnetic hard disks, while offering comparable sequential read and write bandwidth. Because of their potential to speed up applications, as well as their reduced power consumption, these new drives are expected to gradually replace hard disks as the primary permanent storage media in large data centers. However, although they may immediately benefit applications that stress random reads, they may not improve database applications, especially those running long data analysis queries. Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks, and their algorithms currently emphasize sequential accesses for disk-resident data. In this paper, we investigate data structures and algorithms that leverage fast random reads to speed up selection, projection, and join operations in relational query processing. We first demonstrate how a column-based layout within each page reduces the amount of data read during selections and projections. We then introduce FlashJoin, a general pipelined join algorithm that minimizes accesses to base and intermediate relational data. FlashJoin's binary join kernel accesses only the join attributes, producing partial results in the form of a join index. Subsequently, its fetch kernel retrieves the attributes for later nodes in the query plan as they are needed. FlashJoin significantly reduces memory and I/O requirements for each join in the query. We implemented these techniques inside Postgres and experimented with an enterprise SSD. Our techniques improved query runtimes by up to 6x for queries ranging from simple relational scans and joins to full TPC-H queries.
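The two-kernel structure described above can be illustrated with a minimal Python sketch. It is not the Postgres implementation: the function names `join_kernel` and `fetch_kernel`, the use of a hash join inside the join kernel, and the in-memory dict-of-lists standing in for column-organized pages on an SSD are all assumptions made for illustration. Row ids are simply list positions, and each lookup by row id stands in for a random read.

```python
from collections import defaultdict


def join_kernel(left_keys, right_keys):
    """Join on the join attribute only (hypothetical helper).

    Returns a join index: a list of (left_rid, right_rid) pairs.
    A hash join is assumed here; no other attributes are touched.
    """
    table = defaultdict(list)
    for rid, key in enumerate(left_keys):
        table[key].append(rid)
    join_index = []
    for right_rid, key in enumerate(right_keys):
        for left_rid in table.get(key, ()):
            join_index.append((left_rid, right_rid))
    return join_index


def fetch_kernel(join_index, left_cols, right_cols, left_attrs, right_attrs):
    """Materialize only the requested attributes (hypothetical helper).

    Each row-id lookup models a random read of one column value.
    """
    for left_rid, right_rid in join_index:
        row = {a: left_cols[a][left_rid] for a in left_attrs}
        row.update({a: right_cols[a][right_rid] for a in right_attrs})
        yield row


# Toy column-organized relations (illustrative data only).
customers = {"cust": [10, 20, 30], "name": ["Ann", "Bob", "Cy"]}
orders = {"o_id": [1, 2, 3], "cust": [10, 20, 10], "total": [5.0, 7.5, 2.0]}

# Step 1: the join kernel reads only the two "cust" columns.
index = join_kernel(customers["cust"], orders["cust"])

# Step 2: the fetch kernel reads "name" and "total" only for matching rows.
rows = list(fetch_kernel(index, customers, orders, ["name"], ["total"]))
```

In this sketch the join index holds only row-id pairs, so the memory needed by the join does not grow with the width of the tuples. The column `o_id` is never read because no later operator asks for it.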