Frequency-based load shedding over a data stream of tuples

Authors:
Joong Hyuk Chang;Hye-Chung (Monica) Kum
Affiliations:
Dept. of Computer Science and Engineering, Wright State University, USA;Dept. of Computer Science, University of North Carolina at Chapel Hill, USA
Venue:
Information Sciences: an International Journal
Year:
2009

Abstract

The generation rate of a data stream is usually unpredictable, and some data elements cannot be processed in real time when that rate exceeds the capacity of the stream processing algorithm. A load shedding technique is recommended to handle this situation gracefully. This paper proposes a frequency-based load shedding technique over a data stream of tuples. In many data stream processing applications, such as mining frequent patterns, data elements with high frequency can be considered more significant than those with low frequency. Based on this observation, the proposed technique processes only the frequent elements of a data stream in real time and trims the others. The decision whether to shed load is controlled automatically by the data generation rate of the stream. Consequently, the proposed technique does not perform unnecessary load shedding.
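
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the class name `FrequencyLoadShedder`, the per-batch `capacity`, the relative `min_support` threshold, and the use of exact counts are all assumptions made for illustration. The sketch only shows the two properties the abstract states: shedding happens only when the arrival rate exceeds capacity, and infrequent tuples are the ones trimmed.

```python
from collections import Counter


class FrequencyLoadShedder:
    """Hypothetical sketch: shed low-frequency tuples only when overloaded."""

    def __init__(self, capacity, min_support):
        self.capacity = capacity        # assumed: max tuples processable per batch
        self.min_support = min_support  # assumed: relative frequency threshold
        self.counts = Counter()         # exact counts, kept simple for illustration
        self.seen = 0                   # total tuples observed so far

    def admit(self, batch):
        """Return the tuples of `batch` that should be processed."""
        # Counting is assumed cheap enough to run on every arriving tuple.
        self.counts.update(batch)
        self.seen += len(batch)
        if len(batch) <= self.capacity:
            return list(batch)  # rate within capacity: no shedding
        # Overloaded: keep only tuples that are currently frequent.
        threshold = self.min_support * self.seen
        return [t for t in batch if self.counts[t] >= threshold]
```

A batch that fits within `capacity` passes through unchanged, so no load is shed unnecessarily. Under overload, the filtered batch may still exceed `capacity` if many tuples are frequent; a fuller design would need a policy for that case, which the abstract does not describe.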