Due to the overwhelming flow of information in many data stream applications, data outsourcing is a natural and effective paradigm for individual businesses to address the issue of scale. In the standard data outsourcing model, the data owner outsources streaming data to one or more third-party servers, which answer queries posed by a potentially large number of clients on the data owner's behalf. Data outsourcing intrinsically raises issues of trust, making outsourced query assurance on data streams a problem with important practical implications. Existing solutions proposed in this model all build upon cryptographic primitives such as signatures and collision-resistant hash functions, and they work only for certain types of queries, for example, simple selection/aggregation queries. In this article, we consider another common type of query, namely, “GROUP BY, SUM” queries, which previous techniques fail to support. Our new solutions are not based on cryptographic primitives, but instead use algebraic and probabilistic techniques to compute a small synopsis of the true query result, which is then communicated to the client so that the client can verify the correctness of the query result returned by the server. The synopsis uses a constant amount of space irrespective of the result size, has an extremely small probability of failure, and can be maintained using no extra space when the query result changes as elements stream by. We then generalize our synopsis to allow some tolerance on the number of erroneous groups, in order to support semantic load shedding on the server. When the number of erroneous groups is indeed tolerable, the synopsis can be strengthened so that we can locate and even correct these errors. Finally, we implement our techniques and perform an empirical evaluation using live network traffic.
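The abstract does not spell out how the synopsis is built, so the following is only a minimal sketch of one way an algebraic, constant-space synopsis for “GROUP BY, SUM” results could work: a polynomial-identity fingerprint over a prime field. The class name `GroupSumSynopsis`, the modulus `P`, the integer group identifiers, and the way the secret point `alpha` is drawn are all assumptions made for illustration, not details taken from the article.

```python
import random

# Assumed field modulus: the Mersenne prime 2^61 - 1 (illustrative choice).
P = (1 << 61) - 1


class GroupSumSynopsis:
    """Hypothetical fingerprint of a GROUP BY, SUM result.

    Stores only two integers (alpha and value), regardless of how many
    groups the query result contains.
    """

    def __init__(self, num_groups, seed=None):
        rng = random.Random(seed)
        self.num_groups = num_groups
        # Secret evaluation point, never revealed to the server. Drawing it
        # from [num_groups, P) keeps (alpha - group) nonzero modulo P for
        # every group id in [0, num_groups).
        self.alpha = rng.randrange(num_groups, P)
        self.value = 1

    def update(self, group, amount):
        """Fold the stream element (group, amount) into the fingerprint."""
        # Multiplies in (alpha - group)^amount mod P. A negative amount uses
        # the modular inverse, which three-argument pow supports in
        # Python 3.8 and later.
        self.value = self.value * pow(self.alpha - group, amount, P) % P

    def verify(self, claimed):
        """Check a claimed {group: sum} result against the fingerprint."""
        expected = 1
        for group, total in claimed.items():
            expected = expected * pow(self.alpha - group, total, P) % P
        return expected == self.value


# Usage: the owner (or client) maintains the synopsis while elements stream by.
synopsis = GroupSumSynopsis(num_groups=4, seed=7)
for group, amount in [(1, 2), (2, 3), (1, 3)]:
    synopsis.update(group, amount)

print(synopsis.verify({1: 5, 2: 3}))  # result the server should return
print(synopsis.verify({1: 5, 2: 4}))  # result with one tampered group
```

In this sketch a correct result always passes. An incorrect one passes only if `alpha` happens to be a root of the difference between two polynomials, which is unlikely when the field is much larger than the sums involved. Each update touches only the single stored value, so maintaining the fingerprint needs no additional space as the result changes.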