Data page layouts for relational databases on deep memory hierarchies

  • Authors:
  • Anastassia Ailamaki; David J. DeWitt; Mark D. Hill

  • Affiliations:
  • School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213-3891, USA, e-mail: natassa@cmu.edu; Department of Computer Science, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI 53706-1685, USA, e-mail: {dewitt, markhill}@cs.wisc.edu; Department of Computer Science, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI 53706-1685, USA, e-mail: {dewitt, markhill}@cs.wisc.edu

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 2002

Abstract

Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a. slotted pages). Recent research, however, indicates that cache utilization and performance are becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across), which significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results (which were obtained without using any indices on the participating relations), when compared to NSM: (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM's stall time due to data cache accesses; (b) range selection queries and updates on memory-resident relations execute 17-25% faster; and (c) TPC-H queries involving I/O execute 11-48% faster. Finally, we show that PAX performs well across different memory system designs.
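
The difference between the two in-page layouts described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the three-attribute schema, the function names, and the use of Python lists in place of byte-level pages are all assumptions made for illustration, and a real page would also carry headers, offsets, and variable-length handling. The per-attribute groups are called "minipages" here.

```python
from typing import List, Sequence, Tuple

# Hypothetical schema for illustration only: (id, name, age).
Record = Tuple[int, str, int]


def nsm_page(records: Sequence[Record]) -> List[object]:
    """NSM-style layout: each record's attribute values are stored contiguously."""
    flat: List[object] = []
    for rec in records:
        flat.extend(rec)
    return flat


def pax_page(records: Sequence[Record]) -> List[List[object]]:
    """PAX-style layout: one minipage per attribute, holding that
    attribute's values for every record in the page."""
    if not records:
        return []
    return [list(column) for column in zip(*records)]


def nsm_scan_attribute(page: List[object], attr: int, num_attrs: int) -> List[object]:
    """Scanning one attribute in NSM strides across whole records."""
    return page[attr::num_attrs]


def pax_scan_attribute(page: List[List[object]], attr: int) -> List[object]:
    """Scanning one attribute in PAX reads a single contiguous minipage."""
    return page[attr]


def pax_record(page: List[List[object]], slot: int) -> Tuple[object, ...]:
    """Reconstruct a full record by taking the same slot from each minipage."""
    return tuple(minipage[slot] for minipage in page)


if __name__ == "__main__":
    rows: List[Record] = [(1, "ann", 30), (2, "bob", 41), (3, "cy", 25)]
    print(nsm_page(rows))
    print(pax_page(rows))
    print(pax_scan_attribute(pax_page(rows), 2))
```

Both pages hold exactly the same records, which is consistent with the abstract's statement that PAX changes only the layout inside a page. The sketch shows only the access pattern: a scan over one attribute touches adjacent values in the PAX layout, whereas in the NSM layout it steps over the other attributes of every record.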