In recent years, the focus of grid computing has shifted towards more data-intensive applications, which increasingly need access to various public and private databases. The overall goal of the D³G framework is to relocate the code for Data Preprocessing (DPP) closer to the data source. This paper presents the architecture on the data service side, which gathers Data Statistics (DS) on the fly, uses them in remote DPP methods applied to query results, and maintains exact continuous DS for whole tables inside a database. The performance results show low running costs for the continuous DS and demonstrate the feasibility of the service-side DPP functionality.
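To illustrate what gathering DS on the fly and reusing them in a DPP method can look like, the following is a minimal sketch and not the D³G implementation. It assumes a single numeric column, insert-only updates, and z-score normalization as the DPP method; the names `RunningStats` and `z_score` are hypothetical. The statistics are maintained with Welford's online update, so each value of a query result is read only once.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical per-column statistics, updated one value at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value: float) -> None:
        # Welford's online update: no second pass over the data is needed.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def std(self) -> float:
        # Population standard deviation; 0.0 while no values were seen.
        return math.sqrt(self.m2 / self.count) if self.count else 0.0


def z_score(values, stats):
    """Example DPP step: normalize values using already gathered statistics."""
    if stats.std == 0.0:
        return [0.0 for _ in values]
    return [(v - stats.mean) / stats.std for v in values]


# Usage: gather statistics while streaming a (made-up) query result,
# then apply the preprocessing step next to the data.
query_result = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for value in query_result:
    stats.update(value)
normalized = z_score(query_result, stats)
```

This sketch covers inserts only. Keeping statistics exact for a whole table under updates and deletes, as the continuous DS described above require, needs additional bookkeeping, for example to recompute the minimum or maximum once the current extreme value is removed.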