Non-linear data stream compression: foundations and theoretical results

Authors:
Alfredo Cuzzocrea;Hendrik Decker
Affiliations:
ICAR-CNR and University of Calabria, Cosenza, Italy and Instituto Tecnológico de Informática, Universidad Politécnica de Valencia, Valencia, Spain;ICAR-CNR and University of Calabria, Cosenza, Italy and Instituto Tecnológico de Informática, Universidad Politécnica de Valencia, Valencia, Spain
Venue:
HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I
Year:
2012

Abstract

In this paper, we provide foundations and theoretical results for a novel paradigm for supporting data stream mining algorithms effectively and efficiently, the so-called non-linear data stream compression model. In particular, the proposed model addresses the class of data stream mining applications where interesting knowledge is extracted from data streams via suitable collections of OLAP queries, the latter being baseline operations of complex knowledge discovery tasks over data streams implemented by ad-hoc data stream mining algorithms. Here, a fruitful line of research consists in admitting approximate, i.e. compressed, representation models and query/mining results in exchange for more efficient and faster computation. Building on this main assumption, the proposed non-linear data stream compression model pursues the idea of maintaining a lower degree of approximation (and, as a consequence, a lower query error) for aggregate information on those data stream readings related to interesting events, and, by contrast, a higher degree of approximation (and, as a consequence, a higher query error) for aggregate information on other data stream readings, i.e. readings not related to any particular event, or related to events of low interest.
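
The abstract does not give the compression algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a one-dimensional stream, a hypothetical predicate `is_interesting` that flags readings tied to an interesting event, and hypothetical bucket widths `fine` and `coarse`. Readings are summarized as per-bucket sums: narrow buckets where events are flagged, wide buckets elsewhere. A range-SUM query (a simple stand-in for an OLAP query) is then answered under a uniform-distribution assumption inside each bucket.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Bucket:
    start: int    # index of the first reading covered (inclusive)
    end: int      # index one past the last reading covered (exclusive)
    total: float  # sum of the readings in [start, end)


def compress(readings: Sequence[float],
             is_interesting: Callable[[int, float], bool],
             fine: int = 1,
             coarse: int = 4) -> List[Bucket]:
    """Summarize readings into buckets of uneven width.

    Hypothetical scheme: runs of interesting readings get buckets of
    width `fine`; all other runs get buckets of width `coarse`.
    """
    buckets: List[Bucket] = []
    i, n = 0, len(readings)
    while i < n:
        flag = is_interesting(i, readings[i])
        width = fine if flag else coarse
        j, total = i, 0.0
        # Extend the bucket while it is not full and the flag is unchanged.
        while j < n and j - i < width and is_interesting(j, readings[j]) == flag:
            total += readings[j]
            j += 1
        buckets.append(Bucket(i, j, total))
        i = j
    return buckets


def range_sum(buckets: Sequence[Bucket], lo: int, hi: int) -> float:
    """Approximate SUM over readings [lo, hi), assuming values are
    spread uniformly inside each bucket."""
    estimate = 0.0
    for b in buckets:
        overlap = min(hi, b.end) - max(lo, b.start)
        if overlap > 0:
            estimate += b.total * overlap / (b.end - b.start)
    return estimate


# Illustrative data: readings of 50 or more are treated as "interesting".
readings = [1, 1, 1, 9, 50, 60, 2, 2]
buckets = compress(readings, lambda idx, value: value >= 50)
exact_on_event = range_sum(buckets, 4, 6)   # covers only fine buckets
approx_off_event = range_sum(buckets, 3, 4) # falls inside a coarse bucket
```

In this toy run the query over the event region is answered exactly, while the query over a single non-event reading is estimated from a coarse bucket and so carries an error. That asymmetry is what the abstract describes, although the actual model may realize it quite differently.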