Grid-aware approach to data statistics, data understanding and data preprocessing

  • Authors:
  • Alexander Wohrer;Lenka Novakova;Peter Brezany;A. Min Tjoa

  • Affiliations:
  • Institute for Scientific Computing, Faculty of Computer Science, University of Vienna, Nordbergstrasse 15/C/3, 1090 Vienna, Austria; Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Cybernetics, Technicka 2, 166 27 Prague 6, Czech Republic; Institute for Scientific Computing, Faculty of Computer Science, University of Vienna, Nordbergstrasse 15/C/3, 1090 Vienna, Austria; Institute of Software Technology, Vienna University of Technology, Favoritenstr. 9-11/188, 1040 Wien, Austria

  • Venue:
  • International Journal of High Performance Computing and Networking
  • Year:
  • 2009

Abstract

In recent years, the focus of grid computing has shifted towards more data-intensive applications, which increasingly need access to various public and private databases. The overall task of the D³G framework is to relocate the code for Data Preprocessing (DPP) closer to the data source. This paper presents the data-service-side architecture used to gather Data Statistics (DS) on the fly, to use them in remote DPP methods applied to query results, and to gather exact continuous DS for whole tables inside a database. The performance results show low running costs for the continuous DS and demonstrate the feasibility of the service-side DPP functionality.
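The idea of gathering DS on the fly and reusing them in a DPP method can be illustrated with a minimal sketch. It is not the D³G implementation: the abstract does not say which statistics are collected or how, so the choice of count, mean, variance, minimum and maximum, the use of Welford's single-pass update, and the names `RunningStats`, `stream_with_stats` and `min_max_scale` are all assumptions made for illustration only.

```python
import math


class RunningStats:
    """Hypothetical single-pass statistics for one numeric column."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def update(self, value):
        # Welford's update: numerically stable, needs no second pass.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def variance(self):
        # Population variance of the values seen so far.
        return self._m2 / self.count if self.count else 0.0


def stream_with_stats(rows, stats):
    """Pass query-result values through unchanged while updating stats."""
    for value in rows:
        stats.update(value)
        yield value


def min_max_scale(value, stats):
    """Example DPP step (assumed): rescale a value to [0, 1] using the stats."""
    span = stats.maximum - stats.minimum
    return 0.0 if span == 0 else (value - stats.minimum) / span


if __name__ == "__main__":
    stats = RunningStats()
    values = list(stream_with_stats([2, 4, 4, 4, 5, 5, 7, 9], stats))
    print(stats.count, stats.mean, stats.variance)
    print([round(min_max_scale(v, stats), 3) for v in values])
```

In this sketch the statistics are a by-product of streaming the result to the client, so a later preprocessing step such as normalization can run on the service side without rescanning the data.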