B+ retake: sustaining high volume inserts into large data pages

  • Authors:
  • Kurt W. Deschler;Elke A. Rundensteiner

  • Affiliations:
  • Worcester Polytechnic Institute, Worcester, MA;Worcester Polytechnic Institute, Worcester, MA

  • Venue:
  • Proceedings of the 4th ACM international workshop on Data warehousing and OLAP
  • Year:
  • 2001

Abstract

Modern ad-hoc data mining queries often run on databases over a terabyte in size. At this scale, large data pages are required to obtain sufficient disk performance. Unfortunately, these large data pages greatly increase update costs, especially for packed structures such as the B+ tree. In a frequently updated warehouse, users are often forced to choose between query performance and update performance in order to meet maintenance time windows. Solutions that provide both are welcome.

In this paper, we analyze and measure the memory-related costs of B+ tree updates with large data pages. We introduce the RB+ (Red-Black+) tree as a practical replacement for the B+ tree. The RB+ tree uses persistent red-black binary trees instead of sorted records for leaf pages. This organization improves memory performance by up to 3,000% for updates and provides query performance comparable to a B+ tree, making it practical for large, frequently updated warehouses.
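The update-cost argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The class names (PackedPage, TreePage), the counters, and the compare helper are hypothetical, and the in-page tree here is a plain unbalanced binary search tree: red-black rebalancing and persistence are omitted for brevity. The sketch only counts how many records a sorted, packed leaf page must shift per insert, versus the single child-pointer write needed when records are linked as tree nodes in append-only slots.

```python
import bisect
import random


class PackedPage:
    """Leaf page holding keys in one sorted array, as in a classic B+ tree leaf."""

    def __init__(self):
        self.keys = []
        self.moves = 0  # records shifted to keep the array sorted

    def insert(self, key):
        pos = bisect.bisect_left(self.keys, key)
        self.moves += len(self.keys) - pos  # every later record slides one slot
        self.keys.insert(pos, key)


class TreePage:
    """Leaf page whose records are nodes of an in-page binary search tree.

    Assumption for illustration: nodes live in append-only slots and refer to
    their children by slot index. Rebalancing (red-black rules) is omitted.
    """

    def __init__(self):
        self.keys = []
        self.left = []
        self.right = []
        self.pointer_writes = 0  # child pointers updated by inserts

    def insert(self, key):
        slot = len(self.keys)
        self.keys.append(key)  # the new record goes into the next free slot
        self.left.append(-1)
        self.right.append(-1)
        if slot == 0:
            return  # first record becomes the root
        node = 0
        while True:
            child = self.left if key < self.keys[node] else self.right
            if child[node] == -1:
                child[node] = slot  # link the new node; nothing else moves
                self.pointer_writes += 1
                return
            node = child[node]

    def in_order(self):
        """Return keys in sorted order via an iterative in-order traversal."""
        out, stack = [], []
        node = 0 if self.keys else -1
        while stack or node != -1:
            while node != -1:
                stack.append(node)
                node = self.left[node]
            node = stack.pop()
            out.append(self.keys[node])
            node = self.right[node]
        return out


def compare(n=2000, seed=0):
    """Insert the same n distinct random keys into both page layouts."""
    rng = random.Random(seed)
    keys = rng.sample(range(10 * n), n)
    packed, tree = PackedPage(), TreePage()
    for k in keys:
        packed.insert(k)
        tree.insert(k)
    assert tree.in_order() == packed.keys  # both layouts hold the same sorted set
    return packed.moves, tree.pointer_writes
```

Calling compare() returns the total number of shifted records for the packed page and the total number of pointer writes for the tree page. The packed count grows roughly quadratically with the number of records per page, while the tree page performs exactly one pointer write per insert after the first. A real red-black tree adds a bounded number of recoloring and rotation writes per insert, which this sketch does not model.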