We study streaming data for a data warehouse that combines several sources. We treat the relative answers to OLAP queries on a schema as distributions, compared under the L1 distance, and approximate these answers without storing the entire data warehouse. We first study how to sample each source and combine the samples to approximate any OLAP query. We then consider a streaming context, in which the data warehouse is built from streams of different sources. We first show a lower bound on the memory size needed to approximate queries, and then consider a statistical hypothesis under which some attributes determine fixed distributions of the measure. We use the sampling methods to learn the statistical model and approximate OLAP queries; in this case, the queries can be approximated with finite memory. We apply the method to a dataset that simulates sensor data, with weather parameters reported over time and location by different sources.
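To make the idea of a relative answer concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes each source is a list of (group, measure) pairs with positive measures, that a relative answer is the share of the total measure held by each group, and that each source is sampled uniformly without replacement with sampled measures rescaled by the source size. The function names, the city labels, and the sample size are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def relative_answer(tuples):
    """Share of the total measure per group: a distribution over groups."""
    totals = defaultdict(float)
    for group, measure in tuples:
        totals[group] += measure
    grand_total = sum(totals.values())
    return {g: v / grand_total for g, v in totals.items()}


def l1_distance(p, q):
    """L1 distance between two distributions given as dicts."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


def sample_source(tuples, m, rng):
    """Uniform sample of at most m tuples, each measure scaled by n/m.

    The scaling is an assumption of this sketch: it keeps a large source
    from being under-weighted when every source contributes m samples.
    """
    m = min(m, len(tuples))
    scale = len(tuples) / m
    return [(g, x * scale) for g, x in rng.sample(tuples, m)]


def approximate_answer(sources, m, rng):
    """Sample every source separately, then answer on the combined sample."""
    combined = []
    for tuples in sources:
        combined.extend(sample_source(tuples, m, rng))
    return relative_answer(combined)


# Synthetic data standing in for two sensor sources (hypothetical values).
rng = random.Random(0)
cities = ["Paris", "Lyon", "Nice"]
source_a = [(rng.choice(cities), rng.uniform(1, 10)) for _ in range(5000)]
source_b = [(rng.choice(cities), rng.uniform(1, 10)) for _ in range(2000)]

exact = relative_answer(source_a + source_b)
approx = approximate_answer([source_a, source_b], 500, rng)
error = l1_distance(exact, approx)
print(f"L1 error of the sampled answer: {error:.4f}")
```

The sketch stores only the samples when answering the query. The exact answer is computed here solely to measure the L1 error and would not be available in the streaming setting.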