Supporting self-adaptation in streaming data mining applications

Authors:
Liang Chen; Gagan Agrawal
Affiliations:
Department of Computer Science and Engineering, Ohio State University, Columbus, OH; Department of Computer Science and Engineering, Ohio State University, Columbus, OH
Venue:
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Year:
2006

Abstract

In many application classes, users are flexible with respect to output quality, while other constraints, such as the need for real-time or interactive response, are more crucial. This paper presents and evaluates a runtime algorithm for supporting adaptive execution of such applications. The particular domain we target is distributed data mining on streaming data. This work has been done in the context of a middleware system called GATES (Grid-based AdapTive Execution on Streams) that we have been developing. The self-adaptation algorithm has the following characteristics. First, it carefully evaluates the long-term load at each processing stage. Second, it considers the different possibilities for the load at a processing stage and at its next stages, and decides whether the value of an adaptation parameter needs to be modified and, if so, in which direction. Third, to find the ideal new value of an adaptation parameter, it performs a binary search over the specified range of the parameter. To evaluate the self-adaptation algorithm in our middleware, we have implemented two streaming data mining applications. The main observations from our experiments are as follows. First, the algorithm converges quickly to stable values of the adaptation parameter for different data arrival rates, independent of the specified initial value. Second, in a dynamic environment, the algorithm adapts the processing rapidly. Finally, in both static and dynamic environments, the algorithm clearly outperforms both the algorithm described in our earlier work and an obvious alternative based on linear updates.
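The three steps named in the abstract (estimate long-term load, choose a direction, binary-search the parameter range) can be illustrated with a minimal Python sketch. This is not the GATES implementation: the abstract gives no formulas, so the exponential averaging, the load thresholds, the convention that a larger parameter value means more work per item, and all names (`smooth_load`, `decide_direction`, `BinarySearchTuner`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


def smooth_load(previous: float, sample: float, alpha: float = 0.2) -> float:
    """Exponentially weighted average, an assumed stand-in for a long-term load estimate."""
    return (1.0 - alpha) * previous + alpha * sample


def decide_direction(load: float, next_load: float,
                     low: float = 0.3, high: float = 0.7) -> int:
    """Return -1 to reduce work, +1 to add work, 0 to leave the parameter unchanged.

    `load` and `next_load` are normalized loads (0 = idle, 1 = saturated) of a stage
    and of its next stage; the thresholds are hypothetical.
    """
    if load > high or next_load > high:
        return -1  # either stage is overloaded: back off
    if load < low and next_load < low:
        return 1   # both stages have headroom: do more work
    return 0


@dataclass
class BinarySearchTuner:
    """Binary search for a parameter value inside a bracket [lo, hi]."""
    lo: float
    hi: float
    value: float

    def step(self, direction: int) -> float:
        if direction < 0:
            # Current value is too expensive: it becomes the new upper bound.
            self.hi = self.value
            self.value = (self.lo + self.value) / 2.0
        elif direction > 0:
            # Current value is too cheap: it becomes the new lower bound.
            self.lo = self.value
            self.value = (self.value + self.hi) / 2.0
        return self.value


if __name__ == "__main__":
    tuner = BinarySearchTuner(lo=0.0, hi=1.0, value=1.0)
    load = 0.0
    for sample in (0.9, 0.95, 0.9, 0.2, 0.1):
        load = smooth_load(load, sample, alpha=0.5)
        print(round(load, 3), tuner.step(decide_direction(load, next_load=0.1)))
```

Halving the bracket at each step is what lets a search of this kind settle in a number of steps logarithmic in the range width, whereas a fixed-increment (linear) update needs a number of steps proportional to the distance from the target value. A real system would also need to widen the bracket again when the data arrival rate changes; that part is omitted here.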