Active storage networks for accelerating K-means data clustering

  • Authors:
  • Janardhan Singaraju;John A. Chandy

  • Affiliations:
  • Department of Electrical and Computer Engineering, University of Connecticut, Storrs, Connecticut;Department of Electrical and Computer Engineering, University of Connecticut, Storrs, Connecticut

  • Venue:
  • ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications
  • Year:
  • 2011

Abstract

High-performance computing systems are often limited by how quickly their storage systems can deliver data. Active Storage Networks (ASNs) provide an opportunity to improve both storage and computational performance by offloading computation to the network switch. An ASN is built around an intelligent network switch that processes data as it flows through the storage area network from storage nodes to client nodes. In this paper, we demonstrate an ASN used to accelerate K-means clustering. The K-means data clustering algorithm is a compute-intensive scientific data processing algorithm. It is an iterative algorithm that groups a large set of multidimensional data points into K distinct clusters. We investigate functional and data parallelism techniques as applied to the K-means clustering problem and show that the in-network processing of an ASN greatly improves performance.
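As a rough illustration of the data-parallel structure that the abstract refers to, the following is a minimal Python sketch of K-means in which each data partition (standing in for a storage node) computes per-cluster partial sums and counts, and a separate merge step (standing in for in-network aggregation) combines them into new centroids. It is not the authors' implementation: the function names `assign_and_accumulate`, `merge_partials`, and `kmeans`, the squared Euclidean distance metric, and the convergence tolerance are all assumptions made for this example.

```python
def assign_and_accumulate(points, centroids):
    """Per-partition step: assign each point to its nearest centroid and
    return per-cluster coordinate sums and point counts (hypothetical helper)."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Nearest centroid by squared Euclidean distance (an assumed metric).
        best = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += p[d]
    return sums, counts


def merge_partials(partials, centroids):
    """Aggregation step: combine partial sums and counts from all partitions
    into updated centroids. An empty cluster keeps its previous centroid."""
    k, dim = len(centroids), len(centroids[0])
    updated = []
    for c in range(k):
        total = sum(counts[c] for _, counts in partials)
        if total == 0:
            updated.append(list(centroids[c]))
        else:
            updated.append(
                [sum(sums[c][d] for sums, _ in partials) / total for d in range(dim)]
            )
    return updated


def kmeans(partitions, centroids, max_iters=100, tol=1e-9):
    """Iterate assignment and aggregation until the centroids stop moving."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iters):
        partials = [assign_and_accumulate(part, centroids) for part in partitions]
        updated = merge_partials(partials, centroids)
        # Largest squared movement of any centroid in this iteration.
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new))
            for old, new in zip(centroids, updated)
        )
        centroids = updated
        if shift <= tol:
            break
    return centroids


if __name__ == "__main__":
    # Two partitions, as if the data were striped across two storage nodes.
    partitions = [[(0, 0), (0, 1)], [(10, 10), (10, 11)]]
    print(kmeans(partitions, [(0, 0), (10, 10)]))
```

Because only the small per-cluster sums and counts need to be combined, the merge step is the kind of reduction that could plausibly run where the partitions' data streams meet, rather than on the client.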