Active Disks for Large-Scale Data Processing

  • Authors:
  • Erik Riedel;Christos Faloutsos;Garth A. Gibson;David Nagle

  • Venue:
  • Computer
  • Year:
  • 2001

Quantified Score

Hi-index 4.10

Abstract

Active disk systems leverage the aggregate processing power of networked disks to offer increased processing throughput for large-scale data mining tasks. As processor performance increases and memory cost decreases, system intelligence continues to move away from the CPU and into peripherals. The authors propose using an active disk storage device that combines on-drive processing and memory with software downloadability to allow disks to execute application-level functions directly at the device.

With active disks, application-specific functions access the excess computation power in drives. Active disks combine the requisite processing power of general-purpose disk-drive microprocessors with the special-purpose functionality of end-user programmability.

The authors' experiment showed that active disks can accelerate an existing database system by moving data-intensive processing to the disks, thereby reducing the server CPU's processing load. The active disk approach eliminates the need for the PC processor, its memory subsystem, and I/O backplane, which makes active disks relatively inexpensive. The development of new data-intensive algorithms requires large amounts of disk space and high data-transfer rates for various processing tasks, with richer database structures, new content, new data sources, and novel applications for collected data. The authors contend that active disks can meet these needs while offering the parallelism available in large storage systems.
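
The idea of moving data-intensive processing to the disks can be illustrated with a toy simulation. The sketch below is not the authors' system or API: `SimulatedDisk`, `host_side_filter`, and `active_disk_filter` are hypothetical names invented for illustration, and the record layout and selectivity are assumptions. It only contrasts how many records cross the interconnect when a filter runs on the host versus when the same filter is downloaded to each drive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, int]
Predicate = Callable[[Record], bool]


@dataclass
class SimulatedDisk:
    """Hypothetical stand-in for a drive; not a real device interface."""
    records: List[Record]

    def read_all(self) -> List[Record]:
        # Conventional drive: ships every record to the host.
        return list(self.records)

    def scan(self, predicate: Predicate) -> List[Record]:
        # Active drive (assumed behavior): runs the downloaded
        # predicate locally and ships only the matching records.
        return [r for r in self.records if predicate(r)]


def host_side_filter(disks: List[SimulatedDisk],
                     predicate: Predicate) -> Tuple[List[Record], int]:
    """Filter on the server CPU; count records moved off the drives."""
    matches: List[Record] = []
    shipped = 0
    for disk in disks:
        data = disk.read_all()
        shipped += len(data)
        matches.extend(r for r in data if predicate(r))
    return matches, shipped


def active_disk_filter(disks: List[SimulatedDisk],
                       predicate: Predicate) -> Tuple[List[Record], int]:
    """Filter at each drive; count records moved off the drives."""
    matches: List[Record] = []
    shipped = 0
    for disk in disks:
        hits = disk.scan(predicate)
        shipped += len(hits)
        matches.extend(hits)
    return matches, shipped


# Four drives of 1000 records each (sizes chosen arbitrarily).
disks = [
    SimulatedDisk([{"id": d * 1000 + i, "value": i} for i in range(1000)])
    for d in range(4)
]


def is_selected(record: Record) -> bool:
    # Assumed highly selective query: 10 of every 1000 records match.
    return record["value"] < 10


host_matches, host_shipped = host_side_filter(disks, is_selected)
active_matches, active_shipped = active_disk_filter(disks, is_selected)
print(host_shipped, active_shipped)
```

Both paths return the same result set; the difference is that the host-side path moves all 4000 records to the server while the on-drive path moves only the 40 matches. In this toy model the saving scales with the query's selectivity and with the number of drives scanning in parallel, which is the effect the abstract attributes to active disks.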