Fine-grained updates in database management systems for flash memory

  • Authors:
  • Zhen He;Prakash Veeraraghavan

  • Affiliations:
  • Department of Computer Science and Computer Engineering, La Trobe University, Bundoora, VIC 3086, Australia;Department of Computer Science and Computer Engineering, La Trobe University, Bundoora, VIC 3086, Australia

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 0.07

Visualization

Abstract

The growing storage capacity of flash memory (up to 640GB) and the proliferation of small mobile devices such as PDAs and mobile phones makes it attractive to build database management systems (DBMSs) on top of flash memory. However, most existing DBMSs are designed to run on hard disk drives. The unique characteristics of flash memory make the direct application of these existing DBMSs to flash memory very energy inefficient and slow. The relatively few DBMSs that are designed for flash suffer from two major short-comings. First, they do not take full advantage of the fact that updates to tuples usually only involve a small percentage of the attributes. A tuple refers to a row of a table in a database. Second, they do not cater for the asymmetry of write versus read costs of flash memory when designing the buffer replacement algorithm. In this paper, we have developed algorithms that address both of these short-comings. We overcome the first short-coming by partitioning tables into columns and then group the columns based on which columns are read or updated together. To this end, we developed an algorithm that uses a cost-based approach, which produces optimal column groupings for a given workload. We also propose a heuristic solution to the partitioning problem. The second short-coming is overcome by the design of the buffer replacement algorithm that automatically determines which page to evict from buffer based on a cost model that minimizes the expected read and write energy usage. Experiments using the TPC-C benchmark [S.T. Leutenegger, D. Dias, A modeling study of the TPC-C benchmark, in: Proceedings of ACM SIGMOD, 1993, pp. 22-31] show that our approach produces up to 40-fold in energy usage savings compared to the state-of-the-art in-page logging approach.