A Services Oriented Framework for Next Generation Data Analysis Centers

  • Authors:
  • H. Wang;A. Ghoting;G. Buehrer;S. Tatikonda;S. Parthasarathy;T. Kurc;J. Saltz

  • Affiliations:
  • The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University

  • Venue:
  • IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
  • Year:
  • 2005

Abstract

Over the past decade, advances in computational and sensor technology have enabled us to dynamically collect vast amounts of data from observations, health screening tests, simulations, and experiments at an ever-increasing pace. Knowledge discovery and data mining is an iterative process concerned with deriving interesting, non-obvious, and useful patterns and models from such large volumes of data. Although inexpensive storage is conducive to maintaining such data, accessing and managing it for knowledge discovery and data mining becomes a performance issue when datasets are large, dynamic, and distributed. In this work, we present our vision of a software framework consisting of middleware services to support interactive data mining over dynamic data at data analysis centers built on top of heterogeneous clusters. The design of a sampling service for dynamic data, together with initial performance results, is also presented.